TechEcho

We received a lot of valuable feedback on our similarity-search engine, which we launched a few days ago.

Based on your feedback, we've made some major changes to improve recall. Specifically, we've begun to include data from our web crawler.

We've also started to prune many of the similarity-search results in order to improve precision.

Finally, we cleaned up the UI to make it clearer what the website does. I think that we still have some work to do in this area, however.

Unfortunately, many of the changes we've made to the algorithm have _dramatically_ slowed down performance. Most searches now take over a minute to complete!

We're hard at work on fixing that, though. Specifically, we're playing around with implementing multi-level counting Bloom filters, count-min Flajolet-Martin sketches, and quantile FM digests.

We should have some major performance improvements up over the next few days.

We're also looking at launching a pre-alpha of a stand-alone software package that implements the ESer algorithm so that people can run similarity searches on their own private data sets.

Please comment with your feedback.

Thanks again!
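For anyone unfamiliar with the sketches named above, here is a rough illustration of the simplest one, a plain count-min sketch. It is a generic textbook version and not the ESer code: the class name, the `width` and `depth` defaults, and the SHA-256-based row hashing are all assumptions made for the example.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter (hypothetical illustration).

    Estimates can overcount because of hash collisions, but never undercount.
    """

    def __init__(self, width=1024, depth=4):
        self.width = width    # counters per row (assumed value)
        self.depth = depth    # number of rows, one hash function each
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Prefix the item with the row number so each row hashes differently.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.rows[row][self._index(item, row)] += count

    def estimate(self, item):
        # The smallest counter is the one least inflated by collisions.
        return min(
            self.rows[row][self._index(item, row)]
            for row in range(self.depth)
        )


sketch = CountMinSketch()
for term in ["search", "search", "engine", "search"]:
    sketch.add(term)
print(sketch.estimate("search"), sketch.estimate("engine"))
```

The trade-off is fixed memory (`width * depth` counters) in exchange for approximate counts, which is why structures like this tend to come up when exact counting is too slow.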

It starts out pretty good, but gets kind of bizarre by the last 20 or so results. Pretty cool idea though!

Reminds me of adwords keyword tool. Fun. Good luck.

We relaunched similarity-search based on Y Combinator feedback. Thoughts?

3 comments
