
Vectors are over, hashes are the future

172 points by jsilvers over 2 years ago

17 comments

fzliu over 2 years ago
Hashes are fine, but to say that "vectors are over" is just plain nonsense. We continue to see vectors as a core part of production systems for entity representation and recommendation (example: https://slack.engineering/recommend-api) and within models themselves (example: multimodal and diffusion models). For folks into metrics, we're building a vector database specifically for storing, indexing, and searching across massive quantities of vectors (https://github.com/milvus-io/milvus), and we've seen close to exponential growth in terms of total downloads.

Vectors are just getting started.
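For concreteness, the operation a vector database accelerates is nearest-neighbor search over embeddings. A toy brute-force NumPy sketch of that operation — illustrative only, not Milvus's or any particular database's API:

    import numpy as np

    def top_k_cosine(query, corpus, k=3):
        """Exact nearest-neighbor search by cosine similarity; vector
        databases exist to approximate this at scales where a full
        scan over the corpus is too slow."""
        q = query / np.linalg.norm(query)
        c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
        sims = c @ q                     # cosine similarity to every row
        idx = np.argsort(-sims)[:k]      # indices of the k best matches
        return idx, sims[idx]

    rng = np.random.default_rng(42)
    corpus = rng.standard_normal((10_000, 128))  # 10k 128-dim embeddings
    query = rng.standard_normal(128)
    print(top_k_cosine(query, corpus))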
gk1 over 2 years ago
This is a rehash (pardon me) of this post from 2021: https://www.search.io/blog/vectors-versus-hashes

The demand for vector embedding models (like those released by OpenAI, Cohere, HuggingFace, etc.) and vector databases (like https://pinecone.io -- disclosure: I work there) has only grown since then. The market has decided that vectors are not, in fact, over.
nelsondev over 2 years ago
Seems the author is proposing LSH instead of vectors for doing ANN?

There are benchmarks here: http://ann-benchmarks.com/ , but LSH underperforms state-of-the-art ANN algorithms like HNSW on recall/throughput.

LSH, I believe, was state of the art 10-ish years ago, but has since been surpassed. Although the caching aspect is really nice.
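For readers who haven't met LSH: a minimal random-hyperplane (SimHash-style) sketch in NumPy — one common LSH family for cosine similarity, not necessarily the article's method:

    import numpy as np

    def simhash_signatures(vectors, n_bits=64, seed=0):
        """Random-hyperplane LSH: each bit records which side of a
        random hyperplane a vector falls on. Vectors with high cosine
        similarity tend to agree on more bits."""
        rng = np.random.default_rng(seed)
        planes = rng.standard_normal((n_bits, vectors.shape[1]))
        return (vectors @ planes.T) > 0   # boolean (n_vectors, n_bits)

    def hamming_distance(a, b):
        return int(np.count_nonzero(a != b))

    # Toy usage: nearby vectors get nearby signatures.
    x = np.array([1.0, 0.0, 0.5])
    y = np.array([0.9, 0.1, 0.4])    # similar to x
    z = np.array([-1.0, 2.0, -0.5])  # dissimilar
    sigs = simhash_signatures(np.vstack([x, y, z]))
    print(hamming_distance(sigs[0], sigs[1]))  # small
    print(hamming_distance(sigs[0], sigs[2]))  # larger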
robotresearcher over 2 years ago
A state vector can represent a point in the state space of floating-point representation, a point in the state space of a hash function, or any other discrete space.

Vectors didn't go anywhere. The article is discussing which function to use to interpret a vector.

Is there a special meaning of 'vector' here that I am missing? Is it so synonymous in the ML context with 'multidimensional floating-point state space descriptor' that any other use is not a vector any more?
whatever1 over 2 years ago
Omg, NN "research" is just heuristics on top of heuristics on top of mumbo jumbo.

Hopefully someone who knows math will enter the field one day, build the theoretical basis for all this mess, and allow us to make real progress.
PLenz over 2 years ago
Hashes are just short, constrained membership vectors
mrkeen over 2 years ago
> The analogy here would be the choice between a 1 second flight to somewhere random in the suburb of your choosing in any city in the world versus a 10 hour trip putting you at the exact house you wanted in the city of your choice.

Wouldn't the first part of the analogy actually be:

A 1 second flight that will probably land at your exact destination, but could potentially land you anywhere on earth?
olliej over 2 years ago
So my interpretation of the neural hash approach is largely that it is essentially trading a smaller number of floats for a much larger number of very small "neurons". Given that, I'd be curious what the total size difference is.

I could see the hash approach, at a functional level, resulting in different features essentially getting a different number of bits directly, which would be approximately equivalent to having a NN with variable-precision floats, all in a very hand-wavy way.

E.g. we could say a NN/NH needs N bits of information to work accurately, in which case you're trading the format and the operations on those N bits.
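A back-of-the-envelope comparison of the two bit budgets (the dimensions below are invented for illustration, not taken from the article):

    # Rough storage comparison: dense float embedding vs. binary hash.
    float_dims = 768               # e.g. a common transformer embedding size
    float_bits = float_dims * 32   # float32 components
    hash_bits = 4096               # a hypothetical binary hash for the same item

    print(f"float32 embedding: {float_bits} bits ({float_bits // 8} bytes)")
    print(f"binary hash:       {hash_bits} bits ({hash_bits // 8} bytes)")
    print(f"ratio: {float_bits / hash_bits:.1f}x")
    # 768 * 32 = 24576 bits vs 4096 bits -> the hash is 6x smaller here,
    # but each "feature" gets 1 bit of precision instead of 32.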
euphetar over 2 years ago
Very shallow article. Would like to see a list of the mentioned "recent breakthroughs" about using hashes in ML besides the retrieval applications, because this is genuinely interesting.
cratermoon over 2 years ago
And then there's this: https://news.ycombinator.com/item?id=33125640
whycombinetor over 2 years ago
The article's 0.65 vs 0.66 float64 example doesn't indicate much, since neither 0.65 nor 0.66 has a terminating representation in base 2...
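This is easy to verify with the Python standard library: Decimal(float) prints the exact binary value a literal is stored as.

    from decimal import Decimal

    # Both 0.65 and 0.66 are repeating fractions in base 2, so the
    # stored doubles are only the nearest representable approximations.
    print(Decimal(0.65))   # slightly off from 0.65
    print(Decimal(0.66))   # slightly off from 0.66
    print(0.66 - 0.65)     # not exactly 0.01, due to rounding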
eterevsky over 2 years ago
So... Isn't this just embeddings with 1 bit per value?

The natural question is: how are you going to train it?
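As a rough sketch of "1 bit per value": binarize a float embedding by sign and compare with Hamming similarity. (Training through the non-differentiable sign step is usually handled with tricks like the straight-through estimator, which this toy omits.)

    import numpy as np

    def binarize(embedding):
        """Collapse each float component to a single bit (its sign)."""
        return embedding > 0

    def hamming_similarity(a, b):
        """Fraction of bits that agree; plays the role cosine
        similarity plays for float embeddings."""
        return float(np.mean(a == b))

    emb1 = np.array([0.3, -1.2, 0.7, 0.01, -0.5])
    emb2 = np.array([0.4, -0.9, 0.6, -0.02, -0.4])  # close to emb1
    print(binarize(emb1))  # [ True False  True  True False]
    print(hamming_similarity(binarize(emb1), binarize(emb2)))  # 0.8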
tomrod over 2 years ago
Maybe I'm misunderstanding the guy, but he is effectively calling for lower-dimensional mappings from vectors to hashes. That is fine and all, but aren't hashes a single dimension in the way he is describing the use?
whimsicalism over 2 years ago
I work in this field and I found this article... very difficult to follow. A more technical description would be helpful so I can pattern-match to my existing knowledge.

Are they re-inventing autoencoders?
sramam over 2 years ago
(I know nothing about the area.)

Am I incorrect in thinking we are headed to future AIs that jump to conclusions? Or is it just my "human neural hash" being triggered in error?!
ummonk over 2 years ago
Such hashes *are* vectors over the Boolean field (with addition being bitwise XOR).
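Concretely, an n-bit hash is an element of GF(2)^n: adding two hashes is XOR, and the weight of the sum is their Hamming distance.

    # An 8-bit hash as a vector over GF(2).
    a = 0b10110100
    b = 0b10011100
    s = a ^ b                  # vector addition in GF(2)^8
    print(f"{s:08b}")          # 00101000
    print(bin(s).count("1"))   # Hamming distance between a and b: 2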
aaaaaaaaaaab over 2 years ago
Phew, I thought you wanted to ditch std::vector for hash maps!