TechEcho

We relaunched similarity-search based on Y Combinator feedback. Thoughts?

4 points by eserorg almost 17 years ago

3 comments

eserorg almost 17 years ago
We received a lot of valuable feedback on our similarity-search engine, which we launched a few days ago.

Based on your feedback, we've made some major changes to improve recall. Specifically, we've begun to include data from our web crawler.

We've also started to prune many of the similarity-search results in order to improve precision.

Finally, we cleaned up the UI to make it clearer what the website does. I think that we still have some work to do in this area, however.

Unfortunately, many of the changes we've made to the algorithm have _dramatically_ slowed down performance. Most searches now take over a minute to complete!

We're hard at work on fixing that, though. Specifically, we're playing around with implementing multi-level counting Bloom filters, count-min Flajolet-Martin sketches, and quantile FM digests.

We should have some major performance improvements up over the next few days.

We're also looking at launching a pre-alpha of a stand-alone software package that implements the ESer algorithm so that people can run similarity searches on their own private data sets.

Please comment with your feedback.

Thanks again!
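
For readers unfamiliar with the sketch structures mentioned above, here is a minimal count-min sketch in Python. It is only an illustration of the general technique, not the ESer implementation: the class name, the `width` and `depth` defaults, and the choice of salted BLAKE2b as the hash family are all assumptions made for this example. The idea is to trade exact term counts for a small fixed-size table whose estimates can overcount but never undercount.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter (illustrative; not the ESer code)."""

    def __init__(self, width=1024, depth=4):
        # width = counters per row, depth = number of independent hash rows.
        # Both defaults are arbitrary values chosen for this example.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One salted hash per row stands in for `depth` independent hashes.
        data = item.encode("utf-8")
        for row in range(self.depth):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum across rows
        # is the tightest available upper bound on the true count.
        return min(self.table[row][col] for row, col in self._cells(item))


sketch = CountMinSketch()
for term in ["search", "startup", "search", "search"]:
    sketch.add(term)
print(sketch.estimate("search"))
```

Memory stays at `width * depth` counters no matter how many distinct terms are added, which is why structures like this are attractive when exact counting is too slow or too large.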

dkasper almost 17 years ago
It starts out pretty good, but gets kind of bizarre by the last 20 or so results. Pretty cool idea though!

jakewolf almost 17 years ago
Reminds me of the AdWords keyword tool. Fun. Good luck.