Right now ANN is a huge (and fascinating) area of active data-structures/algorithms research.

I do sort of wonder whether, instead of optimizing purely for recall versus speed, we'll end up with a family of algorithms with trade-offs tuned to the use case, where we evaluate an ANN approach on metrics for a given domain rather than just on its ability to recreate an exact nearest-neighbor result set.

For example, one thing I often see assumed is that good recall@N also means the ranking within that top N is ideal. Does it? I'd rather not have to manually compute exact NN over that top N if I can avoid it.

And are there specific vector spaces where one ANN approach is preferred? Or will some universal approach just work for everything?

I realize it's too early to tell, but these questions always percolate in my mind when we hit a new benchmark in recall for speed, especially since I still see people doing more naive things that work perfectly fine for their use case (like KD trees, random projections, or LSH).
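On the re-ranking point: the usual cheap fix is to take the ANN candidate set and re-rank it with exact distances, which costs O(k·d) for k candidates instead of a full O(n·d) scan. A minimal NumPy sketch of that idea, using random stand-ins for what an index would actually return:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 64))   # toy corpus of 64-d vectors
query = rng.standard_normal(64)

# Pretend an ANN index returned these 20 candidate ids
# (random stand-ins here; a real index would supply them).
candidates = rng.choice(len(data), size=20, replace=False)

# Exact re-rank: true Euclidean distances over the candidates only,
# O(k*d) work instead of O(n*d) for brute force over the whole corpus.
dists = np.linalg.norm(data[candidates] - query, axis=1)
reranked = candidates[np.argsort(dists)]

print(reranked[:5])  # top-5 candidates after exact re-ranking
```

Whether this extra pass is worth it is exactly the kind of per-domain metric question the comment is getting at.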
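Part of why the "naive" random-projection/LSH approach keeps working fine is how little machinery it needs. A toy sketch of sign-of-projection hashing (all names and parameters here are illustrative, not from any particular library):

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n_bits = 64, 16
planes = rng.standard_normal((n_bits, dim))  # random hyperplanes

def lsh_hash(v):
    # Sign of each projection gives one bit; nearby vectors
    # tend to land on the same side of most hyperplanes.
    return tuple((planes @ v > 0).astype(int))

# Bucket the corpus by hash; a query only probes its own bucket.
data = rng.standard_normal((1000, dim))
buckets = {}
for i, v in enumerate(data):
    buckets.setdefault(lsh_hash(v), []).append(i)

query = rng.standard_normal(dim)
candidates = buckets.get(lsh_hash(query), [])
```

Real systems use multiple hash tables and multi-probe tricks to boost recall, but the core is just this.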