I actually half wrote an RFC of a spec and 2 implementations of a federated search last year, rather than do the distributed hash table that YaCy does.<p>I wanted results to be re-rankable by the peers by sharing the scores that went into them. The idea was that, with a common protocol based on the ideas of ActivityPub, you could get search peers working together to hopefully surface interesting things.<p>It’s something I should probably finish and publish at some point. It worked up to the hundreds of peers I tested with.<p>The reason I mention this is that I also wanted to add a front end to YaCy, which turned out to be harder than I expected. It’s a wonderful project and you can find great stuff through it, but because of the way the peers return results it’s sometimes hard to find something again. It’s also not quite as hackable as I would have hoped at the time, probably due to the project’s age.<p>I still think there is value in it though, and I’d love to see YaCy have its protocol explained as a spec so people could build implementations in other languages more easily.
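To make the "share the scores so peers can re-rank" idea concrete, here is a minimal sketch. It is not the unpublished protocol described above and not anything YaCy does; the `SharedResult` shape, the component names (`text_match`, `freshness`) and the weighted-sum ranking are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SharedResult:
    """A hit as a peer might share it: the URL plus the raw score components.

    Hypothetical shape; the component names are illustrative only.
    """
    url: str
    peer: str
    scores: Dict[str, float] = field(default_factory=dict)


def rerank(results: List[SharedResult], weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Re-rank shared results with the receiving peer's own weights.

    Components without a local weight contribute nothing. If several
    peers return the same URL, only the best-scoring copy is kept.
    """
    best: Dict[str, float] = {}
    for r in results:
        total = sum(weights.get(name, 0.0) * value for name, value in r.scores.items())
        if r.url not in best or total > best[r.url]:
            best[r.url] = total
    # Highest combined score first.
    return sorted(best.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    hits = [
        SharedResult("https://a.example", "peer1", {"text_match": 0.9, "freshness": 0.1}),
        SharedResult("https://b.example", "peer2", {"text_match": 0.5, "freshness": 0.9}),
    ]
    # One peer cares about text match, another about freshness.
    print(rerank(hits, {"text_match": 1.0}))
    print(rerank(hits, {"freshness": 1.0}))
```

The point of shipping the components rather than a single final score is that each peer can apply its own weights without re-crawling or re-scoring anything.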
A long time ago I worked for a startup called Wowd, which built a distributed search engine. It was acquihired by Facebook.<p>One of the biggest issues was how to entice people to download and run the client/node.<p>I half wondered afterwards whether slapping some crypto on top of it, mined by running the node and providing resources, would help. My gut says an easy yes, but my mind grimaces at the abomination.
Previously:<p>YaCy – your own search engine | <a href="https://news.ycombinator.com/item?id=32597309">https://news.ycombinator.com/item?id=32597309</a> | 2 years ago | 93 comments<p>YaCy: Decentralized Web Search | <a href="https://news.ycombinator.com/item?id=22246732">https://news.ycombinator.com/item?id=22246732</a> | 4 years ago | 41 comments<p>YaCy – The Peer to Peer Search Engine | <a href="https://news.ycombinator.com/item?id=17089240">https://news.ycombinator.com/item?id=17089240</a> | 6 years ago | 3 comments<p>YaCy: a free distributed search engine | <a href="https://news.ycombinator.com/item?id=12433010">https://news.ycombinator.com/item?id=12433010</a> | 8 years ago | 24 comments<p>YaCy: Decentralized Web Search | <a href="https://news.ycombinator.com/item?id=8746883">https://news.ycombinator.com/item?id=8746883</a> | 9 years ago | 29 comments<p>YaCy takes on Google with open source search engine | <a href="https://news.ycombinator.com/item?id=3288586">https://news.ycombinator.com/item?id=3288586</a> | 12 years ago | 17 comments
There are already many projects about search:<p>- <a href="https://www.marginalia.nu/" rel="nofollow">https://www.marginalia.nu/</a><p>- <a href="https://searchmysite.net/" rel="nofollow">https://searchmysite.net/</a><p>- <a href="https://lucene.apache.org/" rel="nofollow">https://lucene.apache.org/</a><p>- Elasticsearch<p>- <a href="https://presearch.com/" rel="nofollow">https://presearch.com/</a><p>- <a href="https://stract.com/" rel="nofollow">https://stract.com/</a><p>- <a href="https://wiby.me/" rel="nofollow">https://wiby.me/</a><p>I think all of these projects are fun. I would like to see one succeed at reaching a mainstream level of attention.<p>I have also been gathering link metadata for some time. Maybe I will use it to feed an eventual self-hosted search engine, or language model, if I decide to experiment with that.<p>- domains for seed <a href="https://github.com/rumca-js/Internet-Places-Database">https://github.com/rumca-js/Internet-Places-Database</a><p>- bookmarks seed <a href="https://github.com/rumca-js/RSS-Link-Database">https://github.com/rumca-js/RSS-Link-Database</a><p>- links for year <a href="https://github.com/rumca-js/RSS-Link-Database-2024">https://github.com/rumca-js/RSS-Link-Database-2024</a>
Yacy's still around. Nice.<p>After a year or two of hosting a Yacy instance (2014?) I started winding up on some general (probes, etc) blacklists.<p>I also host a small mail server and I was getting mail returned. I'd force an IP swap and a few weeks later it'd be the same. I had to let Yacy go.
Has it gotten any better recently?<p>I run a node but I haven’t actually used it as a search engine in a while, as I found the result quality to be exceedingly poor.
Are the results still being gamed by sites using content keyword stuffing? The last time I used it the searching and ranking technology felt like they were 40 years behind state of the art.
I once went to a workshop on a Sunday morning at the local makerspace to listen to someone talk about some kind of distributed search engine or something like that. One of the developers came from (I think) Germany to explain this to us, the centralized sheeple. He just gave a demonstration of the thing, like here is the box where you type stuff and here are the results. When I started to ask questions about how it worked and all, he sort of acted annoyed, saying it was all too difficult to explain. This was more than ten years ago, and yes, I am still angry about it.
Nice to see search projects are still popping up. After a move, family life taking over and me getting more interested in Unreal Engine, my poor search engine is now more of an experiment in seeing how well it runs while basically on life support, with only the maintenance updates I do. Starting to think I honestly should just take it down and save the $50 a month I spend maintaining it.<p>But I'll post it in a Hacker News comment and maybe you all will give it enough traffic that I can get excited about it again, lol<p><a href="https://www.unscatter.com" rel="nofollow">https://www.unscatter.com</a>
I've used it several times over the last decades and never got good results. I think one instance is still running on my old computer at uni :-)
I ran YaCy for a while, but not as a node on their distributed search index. I just ran it as a search engine for all my own bookmarks. Unfortunately I never found a particularly good way of getting bookmarks into the system. So eventually I shut it down. Cool idea in theory though.
Related to this: I’d love to see individuals making web pages again, and federated search engines indexing them. People don’t make their own hobby or fan or art websites anymore, and I think that’s partly because nobody will ever find them with the big search engines.
See also: Presearch, another decentralized search engine, claimed that it will be open source. No source code available at the moment though.<p><a href="https://presearch.com/" rel="nofollow">https://presearch.com/</a>
If you run YaCy with docker and it is still a junior peer, does the search return results from the global index or just the one that appears to be 'preinstalled'?
<p><pre><code> curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
</code></pre>
Can't seem to access the page.
Sort of hijacking the thread to ask: can YaCy, or something similar, be an alternative to Google's Programmable Search Engine? All I use it for is to limit a search to a medium-sized list of domains. The aspect that makes running a search engine on your own difficult is the lack of resources for crawling, I expect. But since I only care about a small list of domains, could I ditch Google's and run my own crawler like YaCy?
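For what a "small list of domains" crawl comes down to, here is a minimal sketch of the scope check a self-run crawler would apply to every discovered link. It is generic Python, not YaCy configuration or API; `in_allowlist` and `filter_frontier` are hypothetical helper names, and the domains are placeholders.

```python
from typing import Iterable, List, Set
from urllib.parse import urlsplit


def in_allowlist(url: str, allowed_domains: Set[str]) -> bool:
    """True if the URL's host is an allowed domain or a subdomain of one.

    Matching on a leading dot avoids accepting look-alikes such as
    notexample.org when example.org is allowed.
    """
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


def filter_frontier(urls: Iterable[str], allowed_domains: Iterable[str]) -> List[str]:
    """Keep only the discovered links that stay inside the allowlist."""
    allowed = {d.lower().strip(".") for d in allowed_domains}
    return [u for u in urls if in_allowlist(u, allowed)]


if __name__ == "__main__":
    found = [
        "https://example.org/docs",
        "https://blog.example.org/post",
        "https://notexample.org/",
        "https://example.org.evil.test/",
    ]
    print(filter_frontier(found, ["example.org"]))
```

With a fixed allowlist the crawl frontier stays bounded, which is why the resource problem mentioned above mostly goes away for this use case.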