path: root/indexer/src
Commit message | Author | Age | Files | Lines
* Indexer: Stem words prior to adding/searching them | Baitinq | 2022-11-11 | 1 | -4/+12
* Indexer: Decode html entities for website title and description | Baitinq | 2022-11-06 | 1 | -5/+2
  Maybe we should do it for all the website's content too? :))
* Indexer: Add logging with env_logger | Baitinq | 2022-11-06 | 1 | -4/+8
* Indexer: Switch back to not serving frontend with actix | Baitinq | 2022-11-05 | 1 | -11/+6
  Serving the frontend with actix previously caused it to be unresponsive when the crawler was passing results to the indexer. Now the frontend is again served independently by trunk and the API by actix, which makes them separate processes, so the frontend can remain responsive.
* Indexer: Hold indexer lock for less time when in search endpoint | Baitinq | 2022-11-05 | 1 | -4/+2
* Indexer+Frontend: Integrate with actix | Baitinq | 2022-11-05 | 1 | -3/+12
* Indexer: Actix: Use the same service handler with multiple routes | Baitinq | 2022-11-05 | 1 | -11/+19
* Indexer: Add and use language field in IndexedResource | Baitinq | 2022-11-04 | 2 | -10/+28
* Indexer: Make & implement the trait insert() taking a [word] for insert | Baitinq | 2022-11-04 | 2 | -27/+24
  This has the advantage of requiring fewer calls to insert() and of letting all the logic that runs prior to insertion live in the actual Indexer implementation.
* Indexer: Add missing /search/ route | Baitinq | 2022-11-03 | 1 | -1/+4
  This previously caused no output when doing an empty search through the frontend.
* Lib+Indexer: Make IndexedResource title and description Optional | Baitinq | 2022-11-02 | 2 | -12/+20
* Indexer: Abstract indexer | Baitinq | 2022-11-02 | 2 | -61/+132
  We abstract an indexer's functionality into a trait (Indexer) and move the indexer-specific code into the indexer_implementation.rs file. I'm not sure whether this causes a performance decrease; it should be investigated further.
* Misc: Cargo fmt | Baitinq | 2022-10-30 | 1 | -2/+2
* Indexer: Use kuchiki to split html content into words | Baitinq | 2022-10-30 | 1 | -6/+18
  This works better than html2text when the content contains non-ASCII characters.
* Indexer: Transform all queries into lowercase | Baitinq | 2022-10-30 | 1 | -0/+3
  This is because the reverse index currently contains only lowercase words, as it lowercases them when inserting them.
* Misc: Add local lib crate to share common structs | Baitinq | 2022-10-30 | 1 | -32/+1
* Crawler+Indexer+Frontend: Rename structs to follow logical relations | Baitinq | 2022-10-29 | 1 | -12/+19
  Now Resource is CrawledResource, as it is created by the crawler, and the previous CrawledResource is now IndexedResource, as it's created by the indexer.
* Indexer: Implement basic priority calculation of words in a site | Baitinq | 2022-10-29 | 1 | -7/+6
  We just calculate priority to be the number of occurrences of the word in the site. This is very basic and should be changed :))
* Indexer: Add website title and description to the CrawledResource | Baitinq | 2022-10-28 | 1 | -1/+24
  We now parse the HTML and extract the title and description of the site.
* Frontend: Refactor search_word_in_db() to not need explicit lifetimes | Baitinq | 2022-10-28 | 1 | -6/+6
* Misc: Add TODOs | Baitinq | 2022-10-28 | 1 | -0/+1
* Crawler: Abstract database word fetching with search_word_in_db() | Baitinq | 2022-10-27 | 1 | -2/+10
* Indexer: Add /search with no query endpoint | Baitinq | 2022-10-27 | 1 | -0/+6
  Just returns [].
* Indexer: Setup permissive CORS | Baitinq | 2022-10-27 | 1 | -0/+3
* Indexer: Return json from the /search endpoint | Baitinq | 2022-10-27 | 1 | -7/+4
* Indexer: Use CrawledResource structure as values in the reverse index db | Baitinq | 2022-10-25 | 1 | -11/+43
  This will allow us to integrate priorities and other improvements.
* Indexer: Add "correct" error handling | Baitinq | 2022-10-25 | 1 | -7/+7
* Crawler+Indexer: Rust cleanup | Baitinq | 2022-10-23 | 1 | -11/+4
  Getting more familiar with the language, so fixed some non-optimal into_iter() usage, unnecessary .clone()s, and an unnecessary hack where we could just get a &mut for inserting into the indexer URL database.
* Indexer: Listen on 0.0.0.0 | Baitinq | 2022-10-23 | 1 | -1/+1
* Indexer: Implement basic reverse index searching and adding | Baitinq | 2022-10-22 | 1 | -8/+81
  Very inefficient but kind of functional:::)))))))
* Indexer: Add skeleton http rest endpoint functionality | Baitinq | 2022-10-21 | 1 | -1/+31
  /search and /resource endpoints.
* Misc: Separate OSSE into components | Baitinq | 2022-10-19 | 1 | -0/+3
  We now have a cargo workspace with the Crawler, Client and Indexer packages.
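
Several commit messages in this log describe the indexer's design in words: an Indexer trait whose insert() takes a slice of words, a reverse index keyed by lowercase words, queries lowercased before lookup, and a priority equal to the number of occurrences of a word in a site. The following is a minimal sketch of how those pieces could fit together. It is not the repository's actual code. The struct name InMemoryIndexer, the field names and types, the method signatures, and the single-word search are all assumptions made for illustration, and the stemming, language field and description mentioned in the log are left out.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the IndexedResource named in the log.
/// Field names and types are assumptions, not the repository's definitions.
#[derive(Debug, Clone, PartialEq)]
struct IndexedResource {
    url: String,
    title: Option<String>, // the log says title is optional
    priority: u32,         // occurrences of the word in the site
}

/// Assumed shape of the Indexer trait: insert() receives all words of a site at once.
trait Indexer {
    fn insert(&mut self, words: &[String], url: &str, title: Option<String>);
    fn search(&self, word: &str) -> Vec<IndexedResource>;
}

/// Illustrative in-memory implementation (the name is hypothetical).
#[derive(Default)]
struct InMemoryIndexer {
    // Reverse index: lowercase word -> resources containing that word.
    reverse_index: HashMap<String, Vec<IndexedResource>>,
}

impl Indexer for InMemoryIndexer {
    fn insert(&mut self, words: &[String], url: &str, title: Option<String>) {
        // Count how often each lowercased word occurs in this site.
        let mut counts: HashMap<String, u32> = HashMap::new();
        for word in words {
            *counts.entry(word.to_lowercase()).or_insert(0) += 1;
        }
        // Store one entry per distinct word, using the count as the priority.
        for (word, priority) in counts {
            self.reverse_index
                .entry(word)
                .or_default()
                .push(IndexedResource {
                    url: url.to_string(),
                    title: title.clone(),
                    priority,
                });
        }
    }

    fn search(&self, word: &str) -> Vec<IndexedResource> {
        // Lowercase the query, because the index only holds lowercase words.
        let mut results = self
            .reverse_index
            .get(&word.to_lowercase())
            .cloned()
            .unwrap_or_default();
        // Highest priority first.
        results.sort_by(|a, b| b.priority.cmp(&a.priority));
        results
    }
}
```

Passing the whole word slice to insert() keeps the counting and lowercasing inside the implementation, which matches the advantage stated in the 2022-11-04 commit message; a caller only needs one call per crawled site.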