Commit message | Author | Age | Files | Lines
* Crawler: Replace println! with dbg! | Baitinq | 2022-10-23 | 1 | -7/+7
|
* Crawler: Remove prepending of https:// to each url | Baitinq | 2022-10-23 | 2 | -1006/+1006
|   We now prepend it in the top-1000-urls list instead. This fixes crawled URLs ending up with two https:// prefixes.
* Crawler: Only crawl 2 urls per url | Baitinq | 2022-10-23 | 1 | -0/+6
|   This keeps us from getting rate limited by websites.
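
The cap described above could look roughly like the sketch below. The constant name `MAX_LINKS_PER_PAGE` and the function `limit_links` are hypothetical and not taken from the repository; only the limit of 2 comes from the commit subject.

```rust
/// Hypothetical cap; the value 2 comes from the commit subject.
const MAX_LINKS_PER_PAGE: usize = 2;

/// Keep only the first `MAX_LINKS_PER_PAGE` links found on a page,
/// so a single page cannot fan out into many requests.
fn limit_links(links: Vec<String>) -> Vec<String> {
    links.into_iter().take(MAX_LINKS_PER_PAGE).collect()
}
```

Taking the first links in document order is the simplest policy; the real crawler may pick them differently.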
* Crawler: Change blockingqueue to channels | Baitinq | 2022-10-23 | 3 | -19/+45
|   We now use the async-channel channels implementation. This allows us to have bounded async channels.
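
To illustrate what a bounded channel buys the crawler, here is a minimal sketch of the backpressure idea. It deliberately swaps in the standard library's synchronous `std::sync::mpsc::sync_channel` rather than the async-channel crate the commit uses, so that it has no external dependencies. The function `enqueue_bounded` is hypothetical.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Try to queue every URL on a channel that holds at most `capacity` items.
/// Returns the URLs that fit and the URLs that were rejected because the
/// channel was full. (Stand-in for an async bounded channel.)
fn enqueue_bounded(urls: &[&str], capacity: usize) -> (Vec<String>, Vec<String>) {
    let (tx, rx) = sync_channel::<String>(capacity);
    let mut rejected = Vec::new();

    for url in urls {
        // try_send never blocks: it fails immediately when the buffer is full.
        match tx.try_send(url.to_string()) {
            Ok(()) => {}
            Err(TrySendError::Full(u)) | Err(TrySendError::Disconnected(u)) => rejected.push(u),
        }
    }

    // Dropping the sender closes the channel so the drain below terminates.
    drop(tx);
    let queued: Vec<String> = rx.iter().collect();
    (queued, rejected)
}
```

With an unbounded queue every discovered link would be accepted and memory use could grow without limit; a bounded channel instead forces the producer to wait (or, as here, to drop work) once the buffer is full.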
* Indexer: Listen on 0.0.0.0 | Baitinq | 2022-10-23 | 1 | -1/+1
|
* Indexer: Implement basic reverse index searching and adding | Baitinq | 2022-10-22 | 3 | -15/+163
|   Very inefficient but kind of functional :)
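
For readers unfamiliar with the term, a reverse (inverted) index maps each word to the set of pages containing it. The sketch below is one minimal way to support adding and searching, assuming whitespace tokenisation and case-insensitive matching. The type `ReverseIndex` and its methods are hypothetical and are not the indexer's actual code.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-memory reverse index: word -> URLs containing that word.
#[derive(Default)]
struct ReverseIndex {
    entries: HashMap<String, HashSet<String>>,
}

impl ReverseIndex {
    /// Record every whitespace-separated word of `content` as pointing to `url`.
    fn add(&mut self, url: &str, content: &str) {
        for word in content.split_whitespace() {
            self.entries
                .entry(word.to_lowercase())
                .or_default()
                .insert(url.to_string());
        }
    }

    /// Return the URLs matching any word of `query`, sorted for stable output.
    fn search(&self, query: &str) -> Vec<String> {
        let mut hits: HashSet<String> = HashSet::new();
        for word in query.split_whitespace() {
            if let Some(urls) = self.entries.get(&word.to_lowercase()) {
                hits.extend(urls.iter().cloned());
            }
        }
        let mut sorted: Vec<String> = hits.into_iter().collect();
        sorted.sort();
        sorted
    }
}
```

This version does no ranking, stemming, or punctuation handling, which is in keeping with the "very inefficient but kind of functional" description.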
* Crawler: Implement basic async functionality | Baitinq | 2022-10-22 | 3 | -93/+285
|
* Crawler: Add basic indexer communication | Baitinq | 2022-10-21 | 2 | -11/+48
|
* Indexer: Add skeleton http rest endpoint functionality | Baitinq | 2022-10-21 | 3 | -1/+539
|   Adds the /search and /resource endpoints.
* Crawler: Add Err string in the craw_url method | Baitinq | 2022-10-20 | 1 | -3/+3
|
* Crawler: Add indexer interaction skeleton | Baitinq | 2022-10-20 | 1 | -1/+5
|
* Crawler: Wrap crawl response in Result type | Baitinq | 2022-10-20 | 1 | -18/+23
|
* Crawler: Normalise relative urls | Baitinq | 2022-10-20 | 1 | -2/+17
|   We now normalise URLs starting with / (relative to root) and // (relative to protocol).
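
The two cases named above can be sketched as follows. This is an illustrative sketch using only string handling from the standard library; the function `normalise_url` and its `origin` parameter (scheme plus host of the page being crawled) are assumptions, not the crawler's actual signature.

```rust
/// Turn a link found on a page into an absolute URL.
/// `origin` is assumed to be the scheme and host of the crawled page,
/// e.g. "https://example.com".
fn normalise_url(link: &str, origin: &str) -> String {
    if let Some(rest) = link.strip_prefix("//") {
        // Protocol-relative link: reuse the scheme of the current page.
        let scheme = origin
            .split_once("://")
            .map(|(scheme, _)| scheme)
            .unwrap_or("https");
        format!("{}://{}", scheme, rest)
    } else if link.starts_with('/') {
        // Root-relative link: append the path to the current origin.
        format!("{}{}", origin.trim_end_matches('/'), link)
    } else {
        // Anything else is passed through unchanged in this sketch.
        link.to_string()
    }
}
```

The `//` check has to come before the `/` check, since every protocol-relative link also starts with a single slash. Path-relative links such as `../page` are not handled here.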
* Crawler: Remove duplicate parsed urls | Baitinq | 2022-10-20 | 3 | -0/+20
|
* Crawler: Add basic html parsing and link-following | Baitinq | 2022-10-20 | 3 | -9/+1561
|   Extremely basic implementation. Needs max queue size, error handling, formatting of parsed links.
* Crawler: Add skeleton crawler implementation | Baitinq | 2022-10-20 | 4 | -0/+1051
|   Starts by filling a queue with the top 1000 most visited sites, "crawls" each one (empty fn), and then blocks waiting for new elements on the queue.
* Misc: Change to use "oxalica/rust-overlay" for the nix development shell | Baitinq | 2022-10-19 | 3 | -26/+90
|   This fixes VS Code not being able to find rust-analyzer and rust-src.
* Misc: Separate OSSE into components | Baitinq | 2022-10-19 | 9 | -10/+56
|   We now have a cargo workspace with the Crawler, Client and Indexer packages.
* Initial Commit! | Baitinq | 2022-10-19 | 10 | -0/+136
    This is the initial commit for this experiment of a search engine. I hope I can learn a lot from this!