Looking through HN posts from the last 30 days, there are at least 15 about semantic search. I'm excited about the prospect but haven't seen good results.<p>Is there any info on query performance? And to builders: did you get better results by incorporating semantic search?
I helped develop a "semantic search" engine for patent and non-patent literature about 10 years ago. It was highly successful, enough that our demo made a big sale on the very first day.<p>This engine used a neural network, trained as an autoencoder, that crunches down the word counts for thousands of words to a moderate-dimensional vector, say n=50. This captures correlations between words, so that similar documents are more consistently close in the embedding space than they are in the very high-dimensional word vector space.<p>This kind of system does not improve short queries (fewer than 10 words) but is great for "more like this" queries centered on a document, and for taking a paragraph you wrote describing an invention and finding prior art.<p>We used the TREC evaluation methodology, public data, our proprietary data, and the opinions of users to conclude that our product was much better than a simple baseline search engine and our competitors.
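To make the idea above concrete, here is a minimal sketch of a word-count autoencoder used for "more like this" ranking. It is not the engine described in the comment: it assumes a plain linear autoencoder trained by batch gradient descent, a made-up six-document corpus, and a 2-dimensional code instead of n=50. Every name in it (`train_autoencoder`, `more_like_this`, and so on) is hypothetical.

```python
import math
import random

# Toy corpus (made up for illustration): three battery docs, three engine docs.
docs = [
    "battery electrode lithium cell",
    "lithium battery anode cell",
    "electrode anode battery charge",
    "engine piston cylinder fuel",
    "fuel engine valve piston",
    "cylinder valve engine exhaust",
]

def build_vocab(texts):
    """Map each distinct lowercase token to a column index."""
    return {w: i for i, w in enumerate(sorted({w for t in texts for w in t.lower().split()}))}

def count_vector(text, vocab):
    """L2-normalised word-count vector for one document."""
    v = [0.0] * len(vocab)
    for w in text.lower().split():
        if w in vocab:
            v[vocab[w]] += 1.0
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def encode(x, W_enc):
    """Project a word-count vector into the low-dimensional code."""
    return [sum(x[i] * W_enc[i][j] for i in range(len(x))) for j in range(len(W_enc[0]))]

def train_autoencoder(X, k, epochs=500, lr=0.1, seed=0):
    """Linear autoencoder x -> z -> x_hat, trained on squared reconstruction error."""
    rng = random.Random(seed)
    d = len(X[0])
    W_enc = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(d)]
    W_dec = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(k)]
    for _ in range(epochs):
        g_enc = [[0.0] * k for _ in range(d)]
        g_dec = [[0.0] * d for _ in range(k)]
        for x in X:
            z = encode(x, W_enc)
            x_hat = [sum(z[j] * W_dec[j][i] for j in range(k)) for i in range(d)]
            err = [x_hat[i] - x[i] for i in range(d)]
            # Backpropagate the error through the decoder to the code z.
            dz = [sum(err[i] * W_dec[j][i] for i in range(d)) for j in range(k)]
            for j in range(k):
                for i in range(d):
                    g_dec[j][i] += z[j] * err[i]
                    g_enc[i][j] += x[i] * dz[j]
        step = lr / len(X)
        for j in range(k):
            for i in range(d):
                W_dec[j][i] -= step * g_dec[j][i]
                W_enc[i][j] -= step * g_enc[i][j]
    return W_enc

def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def more_like_this(query_idx, embeddings):
    """Rank every other document by cosine similarity to the query document."""
    q = embeddings[query_idx]
    scored = [(i, cosine(q, e)) for i, e in enumerate(embeddings) if i != query_idx]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

vocab = build_vocab(docs)
X = [count_vector(t, vocab) for t in docs]
W_enc = train_autoencoder(X, k=2)
embeddings = [encode(x, W_enc) for x in X]

for idx, score in more_like_this(0, embeddings):
    print(f"{score:+.3f}  {docs[idx]}")
```

A real system would train on far more text with a nonlinear network and a proper optimiser. The point here is only the shape of the pipeline: count words, compress, then compare documents in the compressed space.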
We’ve had good results with semantic search. We use it because keyword search doesn’t handle minor changes in words gracefully and semantic search does.