path: root/crawler/Cargo.toml
  Commit message                                        Author   Age         Files  Lines
* Misc: Remove unneeded dependencies                    Baitinq  2022-10-30  1      -1/+0
|
* Misc: Add local lib crate to share common structs     Baitinq  2022-10-30  1      -0/+1
|
* Crawler: Use async Client                             Baitinq  2022-10-25  1      -1/+1
|
* Crawler: Shuffle crawled urls                         Baitinq  2022-10-25  1      -1/+2
|
* Crawler: Parse urls with the "url" crate              Baitinq  2022-10-25  1      -0/+1
|   This fixes relative urls, makes url filtering and validation better, and many other improvements.
* Crawler: Change blockingqueue to channels             Baitinq  2022-10-23  1      -1/+1
|   We now use the async-channel channels implementation. This allows us to have bounded async channels.
* Crawler: Implement basic async functionality          Baitinq  2022-10-22  1      -0/+1
|
* Crawler: Add basic indexer communication              Baitinq  2022-10-21  1      -1/+2
|
* Crawler: Remove duplicate parsed urls                 Baitinq  2022-10-20  1      -0/+1
|
* Crawler: Add basic html parsing and link-following    Baitinq  2022-10-20  1      -0/+2
|   Extremely basic implementation. Needs max queue size, error handling, formatting of parsed links.
* Crawler: Add skeleton crawler implementation          Baitinq  2022-10-20  1      -0/+1
|   Starts by filling a queue with the top 1000 most visited sites. "Crawls" each one (empty fn), and blocks for new elements on the queue.
* Misc: Separate OSSE into components                   Baitinq  2022-10-19  1      -0/+12
    We now have a cargo workspace with the Crawler, Client and Indexer packages.