Commit log (each entry: date, subject, author, files changed, lines removed/added)
2022-10-25  Indexer: Use CrawledResource structure as values in the reverse index db  (Baitinq, 3 files, -11/+45)
This will allow us to integrate priorities and other improvements.
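As a rough illustration, a reverse index with such values might look like the following sketch (the field names and types here are assumptions, not the actual OSSE definitions):

    use std::collections::{HashMap, HashSet};

    // Hypothetical value type stored in the reverse index. The `priority`
    // field is a placeholder for the ranking improvements mentioned above.
    #[derive(Debug, Clone, PartialEq, Eq, Hash)]
    struct CrawledResource {
        url: String,
        priority: u32,
    }

    // word -> set of resources whose content contains that word
    type ReverseIndexDb = HashMap<String, HashSet<CrawledResource>>;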
2022-10-25  Indexer: Add "correct" error handling  (Baitinq, 1 file, -7/+7)
2022-10-25  Crawler: Shuffle crawled urls  (Baitinq, 3 files, -4/+5)
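Shuffling the crawl order can be done with the rand crate; this is only a sketch of the idea, not the exact code:

    use rand::seq::SliceRandom;

    // Shuffle the list of urls in place so that the crawling order is randomised.
    fn shuffle_urls(urls: &mut Vec<String>) {
        urls.shuffle(&mut rand::thread_rng());
    }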
2022-10-25  Crawler: Add "correct" error handling  (Baitinq, 1 file, -21/+23)
2022-10-25  Crawler: Parse urls with the "url" crate  (Baitinq, 3 files, -25/+26)
This fixes relative urls, improves url filtering and validation, and brings many other improvements.
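A sketch of what parsing and resolving hrefs with the url crate looks like (the helper name is hypothetical):

    use url::Url;

    // Resolve an href found on `base` into an absolute, validated Url.
    // `join` handles relative paths, "/" root-relative and "//" protocol-relative
    // hrefs, and rejects strings that cannot form a valid url.
    fn parse_href(base: &str, href: &str) -> Option<Url> {
        let base = Url::parse(base).ok()?;
        base.join(href).ok()
    }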
2022-10-24  Crawler: Add crawled url filter  (Baitinq, 1 file, -1/+8)
This filters out hrefs such as "/", "#", or "javascript:".
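The filter boils down to rejecting hrefs that cannot point to a crawlable page, roughly like this (hypothetical helper):

    // Returns false for hrefs that should not be queued for crawling.
    fn is_crawlable_href(href: &str) -> bool {
        !(href == "/" || href.starts_with('#') || href.starts_with("javascript:"))
    }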
2022-10-24  Flake: Add rust-analyzer package  (Baitinq, 1 file, -0/+1)
2022-10-24  Crawler: Set queue size to 2222  (Baitinq, 1 file, -1/+1)
2022-10-24  Misc: Update build/run instructions  (Baitinq, 1 file, -2/+4)
The instructions now show how to run each module as well as the yew frontend.
2022-10-24  Client->Frontend: Create yew frontend skeleton  (Baitinq, 8 files, -14/+238)
We have replaced the client with a yew frontend.
2022-10-23  Crawler+Indexer: Rust cleanup  (Baitinq, 2 files, -14/+6)
Getting more familiar with the language, so we fixed some non-optimal into_iter() usage and unnecessary .clone()s, and removed an unnecessary hack where we could simply take a &mut for inserting into the indexer url database.
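As a sketch of that last point only: with a &mut borrow of the url database, insertion needs no clone or workaround. The key/value types below are placeholders, not the actual OSSE types:

    use std::collections::HashMap;

    // With a &mut borrow, the entry API inserts or updates in place,
    // with no clone of the map or of existing values.
    fn add_url(db: &mut HashMap<String, u32>, url: &str) {
        *db.entry(url.to_string()).or_insert(0) += 1;
    }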
2022-10-23  Crawler: Replace println! with dbg!  (Baitinq, 1 file, -7/+7)
2022-10-23  Crawler: Remove prepending of https:// to each url  (Baitinq, 2 files, -1006/+1006)
We now prepend it to the top-1000-urls list instead. This fixes crawled urls ending up with two https:// prefixes.
2022-10-23  Crawler: Only crawl 2 urls per url  (Baitinq, 1 file, -0/+6)
This prevents us from being rate limited by websites.
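The limit amounts to truncating the list of links parsed from each page, roughly as follows (hypothetical helper, with the limit of 2 from the commit):

    // Keep only the first `limit` links found on a page so that no single
    // site receives a burst of requests from the crawler.
    fn limit_links(parsed_urls: Vec<String>, limit: usize) -> Vec<String> {
        parsed_urls.into_iter().take(limit).collect()
    }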
2022-10-23  Crawler: Change blockingqueue to channels  (Baitinq, 3 files, -19/+45)
We now use the async-channel crate, which allows us to have bounded async channels.
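A sketch of how a bounded async-channel pair can serve as the crawl queue (names are illustrative):

    use async_channel::{bounded, Receiver, Sender};

    // A bounded channel backpressures producers: `send` waits (asynchronously)
    // while the queue is full instead of letting it grow without limit.
    fn make_crawl_queue(capacity: usize) -> (Sender<String>, Receiver<String>) {
        bounded(capacity)
    }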
2022-10-23  Indexer: Listen on 0.0.0.0  (Baitinq, 1 file, -1/+1)
2022-10-22  Indexer: Implement basic reverse index searching and adding  (Baitinq, 3 files, -15/+163)
Very inefficient, but kind of functional. :)
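In the spirit of that "inefficient but functional" first pass, adding and searching can be as simple as the following sketch (the real OSSE types differ; see the CrawledResource change above):

    use std::collections::{HashMap, HashSet};

    // Adding: split the resource text into words and record the url under each word.
    fn add_resource(index: &mut HashMap<String, HashSet<String>>, url: &str, text: &str) {
        for word in text.split_whitespace() {
            index.entry(word.to_lowercase()).or_default().insert(url.to_string());
        }
    }

    // Searching: a single-word lookup returning every url that contained the word.
    fn search(index: &HashMap<String, HashSet<String>>, word: &str) -> HashSet<String> {
        index.get(&word.to_lowercase()).cloned().unwrap_or_default()
    }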
2022-10-22  Crawler: Implement basic async functionality  (Baitinq, 3 files, -93/+285)
2022-10-21  Crawler: Add basic indexer communication  (Baitinq, 2 files, -11/+48)
2022-10-21  Indexer: Add skeleton http rest endpoint functionality  (Baitinq, 3 files, -1/+539)
Adds the /search and /resource endpoints.
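A skeleton of such endpoints, assuming an actix-web server (the framework, the exact route shapes, the payloads, and the port are assumptions for illustration, not the actual OSSE code):

    use actix_web::{get, post, web, App, HttpServer, Responder};

    // GET /search/{query}: look the query up in the reverse index (stubbed here).
    #[get("/search/{query}")]
    async fn search(query: web::Path<String>) -> impl Responder {
        format!("results for {}", query.into_inner())
    }

    // POST /resource: accept a crawled resource from the crawler (stubbed here).
    #[post("/resource")]
    async fn add_resource(body: String) -> impl Responder {
        format!("received {} bytes", body.len())
    }

    #[actix_web::main]
    async fn main() -> std::io::Result<()> {
        HttpServer::new(|| App::new().service(search).service(add_resource))
            .bind(("0.0.0.0", 4444))? // 0.0.0.0 as in the commit above; the port is made up
            .run()
            .await
    }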
2022-10-20  Crawler: Add Err string in the craw_url method  (Baitinq, 1 file, -3/+3)
2022-10-20  Crawler: Add indexer interaction skeleton  (Baitinq, 1 file, -1/+5)
2022-10-20  Crawler: Wrap crawl response in Result type  (Baitinq, 1 file, -18/+23)
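Together with the Err string from the commit above, the crawl call roughly takes this shape (a sketch, not the actual signature):

    // The caller can now propagate failures instead of panicking.
    fn crawl_url(url: &str) -> Result<String, String> {
        // ...fetch and return the page body on success; stubbed here...
        Err(format!("Unable to fetch {}", url))
    }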
2022-10-20  Crawler: Normalise relative urls  (Baitinq, 1 file, -2/+17)
We now normalise urls starting with / (relative to the root) and // (relative to the protocol).
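The hand-rolled normalisation (later replaced by the url crate, see the 2022-10-25 commit above) amounts to roughly this sketch:

    // "//host/path" is relative to the protocol, "/path" is relative to the root.
    // `protocol` is expected in the form "https:" and `host` as "example.com".
    fn normalise_url(protocol: &str, host: &str, href: &str) -> String {
        if href.starts_with("//") {
            format!("{}{}", protocol, href)           // e.g. "https://host/path"
        } else if href.starts_with('/') {
            format!("{}//{}{}", protocol, host, href) // e.g. "https://example.com/path"
        } else {
            href.to_string()
        }
    }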
2022-10-20  Crawler: Remove duplicate parsed urls  (Baitinq, 3 files, -0/+20)
2022-10-20  Crawler: Add basic html parsing and link-following  (Baitinq, 3 files, -9/+1561)
Extremely basic implementation. It still needs a max queue size, error handling, and formatting of parsed links.
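Extracting the links boils down to something like the following, assuming an html parsing crate such as scraper (which crate OSSE actually uses is not shown here):

    use scraper::{Html, Selector};

    // Parse the page and collect the href attribute of every <a> element.
    fn extract_hrefs(html: &str) -> Vec<String> {
        let document = Html::parse_document(html);
        let selector = Selector::parse("a[href]").expect("static selector is valid");
        document
            .select(&selector)
            .filter_map(|a| a.value().attr("href"))
            .map(str::to_string)
            .collect()
    }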
2022-10-20  Crawler: Add skeleton crawler implementation  (Baitinq, 4 files, -0/+1051)
Starts by filling a queue with the top 1000 most visited sites, "crawls" each one (currently an empty fn), and blocks waiting for new elements on the queue.
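The skeleton described above is roughly the following loop (sketched here with a plain VecDeque; the actual code used a blocking queue, later swapped for async channels as seen in the commits above):

    use std::collections::VecDeque;

    fn crawler_skeleton(top_1000_urls: Vec<String>) {
        // Seed the queue with the most visited sites, then process it.
        let mut queue: VecDeque<String> = top_1000_urls.into();
        while let Some(url) = queue.pop_front() {
            crawl_url(&url);
        }
    }

    // Intentionally empty at this stage: filled in by the later commits above.
    fn crawl_url(_url: &str) {}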
2022-10-19  Misc: Change to use "oxalica/rust-overlay" for the nix development shell  (Baitinq, 3 files, -26/+90)
This fixes VSCode not being able to find rust-analyzer and rust-src.
2022-10-19  Misc: Separate OSSE into components  (Baitinq, 9 files, -10/+56)
We now have a cargo workspace with the Crawler, Client and Indexer packages.