Hey HN, I'm building an open source search platform that lives on your device,
indexing what you want, exposing it to you in a super simple & super fast interface.<p>I took the idea of adding "site:reddit.com" to your Google searches and expanded on it with the idea of "lenses" to add context to your search query and give the crawler direction in terms of what to crawl & index. This means that all queries are run locally, it does not relay your search to any 3rd-party search engine. Think of it as your personal bookcase at home vs. the Library of Congress.<p>It's still in a super early state but would love for people to start using it and
providing some feedback and see what sort of lenses people want to build and search through!<p>Some details about the stack for the interested:<p><pre><code> * All Rust w/ some HTML/CSS for the client.
* Client is built w/ yew + tauri
* Backend uses tantivy to index the web pages, sqlite3 to hold metadata / crawl queue
</code></pre>
Thanks in advance!
Cool.<p>But a warning, based on doing quite a lot of crawling from home through my own search engine, it's very easy to have your IP or IP-block end up on annoying graylists where basically every other website you visit will throw a CAPTCHA in your face. I'm aware this is a risk and use a VPN for most of my private web surfing anyway so it's not that much of a bother, but it's a bit sketchy to expose other people to that risk through something like this.<p>It would probably be wise to use canned crawls for major websites, maybe something like trading WARCs <<a href="https://en.wikipedia.org/wiki/Web_ARChive" rel="nofollow">https://en.wikipedia.org/wiki/Web_ARChive</a>> over bit-torrent or whatever. Most of these types of websites don't change <i>that</i> often in the places that matter.