| Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|
| ... | | | | |
| Frontend: Fetch results from indexer | Baitinq | 2022-10-27 | 3 | -33/+53 |
| Crawler: Abstract database word fetching with search_word_in_db() | Baitinq | 2022-10-27 | 1 | -2/+10 |
| Indexer: Add /search with no query endpoint. Just returns []. | Baitinq | 2022-10-27 | 1 | -0/+6 |
| Crawler: Replace String::from with .to_string() | Baitinq | 2022-10-27 | 1 | -3/+6 |
| Indexer: Setup permissive CORS | Baitinq | 2022-10-27 | 3 | -1/+21 |
| Indexer: Return json from the /search endpoint | Baitinq | 2022-10-27 | 3 | -7/+6 |
| Frontend: Add results field to the state and set dummy results | Baitinq | 2022-10-26 | 1 | -2/+46 |
| Frontend: Add basic search_query state | Baitinq | 2022-10-26 | 3 | -8/+158 |
| Frontend: Add basic layout | Baitinq | 2022-10-26 | 2 | -1/+43 |
| Frontend: Update index.html to include bootstrap. Also sets up the viewport and title. | Baitinq | 2022-10-25 | 1 | -3/+16 |
| Crawler: Fix bad error handling with match handling | Baitinq | 2022-10-25 | 1 | -6/+9 |
| Crawler: Use async Client | Baitinq | 2022-10-25 | 4 | -48/+152 |
| Indexer: Use CrawledResource structure as values in the reverse index db. This will allow us to integrate priorities and other improvements. | Baitinq | 2022-10-25 | 3 | -11/+45 |
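The reverse-index change above (storing a `CrawledResource` structure, with room for priorities, instead of a bare URL) can be sketched with the standard library alone. This is a minimal illustration, not the repo's code: the field names (`url`, `priority`) and the `Indexer` methods are assumptions.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical sketch of the commit's idea: reverse-index values are full
// resource records, leaving room for a priority field to rank results later.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
struct CrawledResource {
    url: String,
    priority: u32, // assumed field; not taken from the repo
}

#[derive(Default)]
struct Indexer {
    // word -> set of resources whose content contains that word
    reverse_index: HashMap<String, HashSet<CrawledResource>>,
}

impl Indexer {
    fn add_resource(&mut self, url: &str, content: &str) {
        for word in content.split_whitespace() {
            self.reverse_index
                .entry(word.to_lowercase())
                .or_default()
                .insert(CrawledResource {
                    url: url.to_string(),
                    priority: 0,
                });
        }
    }

    fn search(&self, word: &str) -> Vec<&CrawledResource> {
        self.reverse_index
            .get(&word.to_lowercase())
            .map(|set| set.iter().collect())
            .unwrap_or_default()
    }
}

fn main() {
    let mut indexer = Indexer::default();
    indexer.add_resource("https://example.com", "hello search engine");
    println!("{:?}", indexer.search("search"));
}
```

Using a `HashSet` keeps a resource from being indexed twice under the same word, which matters once the same page is re-crawled.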
| Indexer: Add "correct" error handling | Baitinq | 2022-10-25 | 1 | -7/+7 |
| Crawler: Shuffle crawled urls | Baitinq | 2022-10-25 | 3 | -4/+5 |
| Crawler: Add "correct" error handling | Baitinq | 2022-10-25 | 1 | -21/+23 |
| Crawler: Parse urls with the "url" crate. This fixes relative urls, makes url filtering and validation better, and brings many other improvements. | Baitinq | 2022-10-25 | 3 | -25/+26 |
| Crawler: Add crawled url filter. This filters hrefs such as "/", "#" or "javascript:". | Baitinq | 2022-10-24 | 1 | -1/+8 |
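The href filter described above might look like the following. A hedged sketch only: the function name and the exact rejection rules are illustrative, based on the examples the commit message gives ("/", "#", "javascript:").

```rust
// Hypothetical version of the crawled-url filter: drop root-only,
// fragment-only, and javascript: links before queueing them for crawling.
fn should_crawl(href: &str) -> bool {
    !(href.is_empty()
        || href == "/"
        || href.starts_with('#')
        || href.starts_with("javascript:"))
}

fn main() {
    for href in ["/", "#top", "javascript:void(0)", "https://example.com/page"] {
        println!("{href}: {}", should_crawl(href));
    }
}
```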
| Flake: Add rust-analyzer package | Baitinq | 2022-10-24 | 1 | -0/+1 |
| Crawler: Set queue size to 2222 | Baitinq | 2022-10-24 | 1 | -1/+1 |
| Misc: Update build/run instructions. Now shows how to run each module plus the yew frontend. | Baitinq | 2022-10-24 | 1 | -2/+4 |
| Client->Frontend: Create yew frontend skeleton. We have replaced the client with a yew frontend. | Baitinq | 2022-10-24 | 8 | -14/+238 |
| Crawler+Indexer: Rust cleanup. Getting more familiar with the language, so fixed some non-optimal into_iter() usage, unnecessary .clone()s, and an unnecessary hack where we could just take a &mut for inserting into the indexer url database. | Baitinq | 2022-10-23 | 2 | -14/+6 |
| Crawler: Replace println! with dbg! | Baitinq | 2022-10-23 | 1 | -7/+7 |
| Crawler: Remove prepending of https:// to each url. We now prepend it to the top-1000-urls list instead. This fixes crawled urls having two https:// prefixes. | Baitinq | 2022-10-23 | 2 | -1006/+1006 |
| Crawler: Only crawl 2 urls per url. This makes it so that we don't get rate limited by websites. | Baitinq | 2022-10-23 | 1 | -0/+6 |
| Crawler: Change blockingqueue to channels. We now use the async-channel implementation, which allows us to have bounded async channels. | Baitinq | 2022-10-23 | 3 | -19/+45 |
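The point of the switch above is back-pressure: a bounded queue blocks the producer once it is full, so the crawler cannot outrun the consumer. The repo uses the `async-channel` crate for this; as a std-only analogy of the same bounded behaviour (ours, not the repo's), the standard library's `sync_channel` works the same way in synchronous form:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

fn main() {
    // Capacity 2, like a small crawl queue: send() blocks when full,
    // which is the back-pressure a bounded async channel also provides.
    let (tx, rx) = sync_channel::<String>(2);

    let producer = thread::spawn(move || {
        for url in ["https://a.com", "https://b.com", "https://c.com"] {
            tx.send(url.to_string()).unwrap(); // blocks when the queue is full
        }
        // tx is dropped here, which ends the consumer's iteration below
    });

    for url in rx {
        println!("crawling {url}");
    }
    producer.join().unwrap();
}
```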
| Indexer: Listen on 0.0.0.0 | Baitinq | 2022-10-23 | 1 | -1/+1 |
| Indexer: Implement basic reverse index searching and adding. Very inefficient, but kind of functional :) | Baitinq | 2022-10-22 | 3 | -15/+163 |
| Crawler: Implement basic async functionality | Baitinq | 2022-10-22 | 3 | -93/+285 |
| Crawler: Add basic indexer communication | Baitinq | 2022-10-21 | 2 | -11/+48 |
| Indexer: Add skeleton http rest endpoint functionality. Adds the /search and /resource endpoints. | Baitinq | 2022-10-21 | 3 | -1/+539 |
| Crawler: Add Err string in the craw_url method | Baitinq | 2022-10-20 | 1 | -3/+3 |
| Crawler: Add indexer interaction skeleton | Baitinq | 2022-10-20 | 1 | -1/+5 |
| Crawler: Wrap crawl response in Result type | Baitinq | 2022-10-20 | 1 | -18/+23 |
| Crawler: Normalise relative urls. We now normalise urls starting with / (relative to the root) and // (relative to the protocol). | Baitinq | 2022-10-20 | 1 | -2/+17 |
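The two rules described above (the later "url" crate commit supersedes this manual approach) might be sketched like this. Function and parameter names are hypothetical:

```rust
// Sketch of manual URL normalisation:
//   "//host/path" is relative to the protocol,
//   "/path"       is relative to the site root.
// The "//" case must be checked first, since it also starts with "/".
fn normalise_url(base_protocol: &str, base_host: &str, href: &str) -> String {
    if let Some(rest) = href.strip_prefix("//") {
        format!("{base_protocol}//{rest}") // protocol-relative
    } else if href.starts_with('/') {
        format!("{base_protocol}//{base_host}{href}") // root-relative
    } else {
        href.to_string() // already absolute (or left for later handling)
    }
}

fn main() {
    println!("{}", normalise_url("https:", "example.com", "/about"));
    println!("{}", normalise_url("https:", "example.com", "//cdn.example.com/x.js"));
}
```

Hand-rolled rules like these miss cases (e.g. `../`-relative paths), which is what makes `Url::join` from the "url" crate the better fix.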
| Crawler: Remove duplicate parsed urls | Baitinq | 2022-10-20 | 3 | -0/+20 |
| Crawler: Add basic html parsing and link-following. Extremely basic implementation; needs a max queue size, error handling, and formatting of parsed links. | Baitinq | 2022-10-20 | 3 | -9/+1561 |
| Crawler: Add skeleton crawler implementation. Starts by filling a queue with the top 1000 most visited sites, "crawls" each one (empty fn), and blocks for new elements on the queue. | Baitinq | 2022-10-20 | 4 | -0/+1051 |
| Misc: Change to use "oxalica/rust-overlay" for the nix development shell. This fixes vscode not being able to find rust-analyzer and rust-src. | Baitinq | 2022-10-19 | 3 | -26/+90 |
| Misc: Separate OSSE into components. We now have a cargo workspace with the Crawler, Client and Indexer packages. | Baitinq | 2022-10-19 | 9 | -10/+56 |
| Initial Commit! This is the initial commit for this experiment of a search engine. I hope I can learn a lot from this! | Baitinq | 2022-10-19 | 10 | -0/+136 |