Hashes are fine, but to say that "vectors are over" is just plain nonsense. We continue to see vectors as a core part of production systems for entity representation and recommendation (example: <a href="https://slack.engineering/recommend-api" rel="nofollow">https://slack.engineering/recommend-api</a>) and within models themselves (example: multimodal and diffusion models). For folks into metrics, we're building a vector database specifically for storing, indexing, and searching across massive quantities of vectors (<a href="https://github.com/milvus-io/milvus" rel="nofollow">https://github.com/milvus-io/milvus</a>), and we've seen close to exponential growth in terms of total downloads.<p>Vectors are just getting started.
This is a rehash (pardon me) of this post from 2021: <a href="https://www.search.io/blog/vectors-versus-hashes" rel="nofollow">https://www.search.io/blog/vectors-versus-hashes</a><p>The demand for vector embedding models (like those released by OpenAI, Cohere, HuggingFace, etc) and vector databases (like <a href="https://pinecone.io" rel="nofollow">https://pinecone.io</a> -- disclosure: I work there) has only grown since then. The market has decided that vectors are not, in fact, over.
Seems the author is proposing LSH instead of vectors for doing ANN?<p>There are benchmarks here: <a href="http://ann-benchmarks.com/" rel="nofollow">http://ann-benchmarks.com/</a>, but LSH underperforms state-of-the-art ANN algorithms like HNSW on recall/throughput.<p>I believe LSH was state of the art about ten years ago, but it has since been surpassed. The caching aspect is really nice, though.
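For readers who haven't seen it, here's a minimal sketch of the random-hyperplane flavour of LSH (SimHash) being compared here. This is a toy illustration with names and parameters of my own choosing, not the article's implementation:

```python
import numpy as np

# Random-hyperplane LSH: each hyperplane contributes one bit of the signature.
rng = np.random.default_rng(0)
dim, n_bits = 128, 64
planes = rng.standard_normal((n_bits, dim))  # shared across all vectors

def sig(vec):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return (planes @ vec > 0).astype(np.uint8)

a = rng.standard_normal(dim)
b = a + 0.05 * rng.standard_normal(dim)  # near-duplicate of a
c = rng.standard_normal(dim)             # unrelated vector

# Hamming distance between signatures approximates angular distance,
# so near-duplicates agree on most bits while unrelated vectors don't.
print("a vs b:", int(np.sum(sig(a) != sig(b))), "bits differ")
print("a vs c:", int(np.sum(sig(a) != sig(c))), "bits differ")
```

The caching appeal mentioned above falls out of this: the signature is a short fixed-size bit string, so it can be used directly as a bucket key.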
A state vector can represent a point in the state space of a floating-point representation, a point in the state space of a hash function, or any other discrete space.<p>Vectors didn't go anywhere. The article is discussing which function to use to interpret a vector.<p>Is there a special meaning of 'vector' here that I am missing? Is it so synonymous in the ML context with 'multidimensional floating-point state space descriptor' that any other use is not a vector anymore?
Omg NN “research” is just heuristics on top of heuristics on top of mumbo jumbo.<p>Hopefully someone who knows math will enter the field one day and build the theoretical basis for all this mess and allow us to make real progress.
> The analogy here would be the choice between a 1 second flight to somewhere random in the suburb of your choosing in any city in the world versus a 10 hour trip putting you at the exact house you wanted in the city of your choice.<p>Wouldn't the first part of the analogy actually be:<p>A 1 second flight that will probably land at your exact destination, but could potentially land you anywhere on earth?
So my interpretation of the neural hash approach is that it essentially trades a smaller number of floats for a much larger number of very small “neurons”. Given that, I'd be curious what the total size difference is.<p>I could see the hash approach, at a functional level, resulting in different features effectively getting a different number of bits, which would be approximately equivalent to a NN with variable-precision floats, all in a very hand-wavy way.<p>E.g. we could say a NN/NH needs N bits of information to work accurately, in which case you're trading the format of, and the operations on, those N bits.
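As a back-of-the-envelope illustration of that trade (the numbers here are my own, not from the article): a float32 embedding spends 32 bits per dimension, while a binary hash spends one, so the same bit budget buys 32x as many one-bit "neurons":

```python
import numpy as np

# Illustrative only: footprint of a float embedding vs a binary sign hash.
dim_floats = 768                      # e.g. a common transformer embedding size
float_bytes = dim_floats * 4          # float32: 4 bytes per dimension
print(f"float32 vector: {float_bytes} bytes ({float_bytes * 8} bits)")

# The same bit budget, spent on one-bit units instead:
n_bits = float_bytes * 8
print(f"equivalent budget: a {n_bits}-bit hash, i.e. {n_bits} one-bit 'neurons'")

# Crude binarization of a float vector into a sign hash:
rng = np.random.default_rng(0)
v = rng.standard_normal(dim_floats)
sign_hash = np.packbits(v > 0)        # 768 bits -> 96 bytes, a 32x reduction
print(f"sign hash of the same vector: {sign_hash.nbytes} bytes")
```

Whether accuracy survives that compression is exactly the open question: the format of the N bits changes, and so does every operation you can cheaply run on them.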
Very shallow article. I'd like to see a list of the mentioned "recent breakthroughs" in using hashes in ML besides the retrieval applications, because that would be genuinely interesting.
And then there's this: <a href="https://news.ycombinator.com/item?id=33125640" rel="nofollow">https://news.ycombinator.com/item?id=33125640</a>
Maybe I'm misunderstanding the guy, but he is effectively calling for lower-dimensional mappings from vectors to hashes. That's fine and all, but aren't hashes a single dimension in the way he describes their use?
I work in this field and I found this article... very difficult to follow. More technical description would be helpful so I can pattern match to my existing knowledge.<p>Are they re-inventing autoencoders?
(I know nothing about the area.)<p>Am I incorrect in thinking we are headed to future AIs that jump to conclusions?
Or is it just my "human neural hash" being triggered in error?!