
Vectors are over, hashes are the future

172 points by jsilvers over 2 years ago

17 comments

fzliu over 2 years ago
Hashes are fine, but to say that "vectors are over" is just plain nonsense. We continue to see vectors as a core part of production systems for entity representation and recommendation (example: https://slack.engineering/recommend-api) and within models themselves (example: multimodal and diffusion models). For folks into metrics, we're building a vector database specifically for storing, indexing, and searching across massive quantities of vectors (https://github.com/milvus-io/milvus), and we've seen close to exponential growth in terms of total downloads.

Vectors are just getting started.
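For concreteness, the operation a vector database accelerates is nearest-neighbor search over embeddings. A toy brute-force NumPy sketch of that operation — illustrative only, not Milvus's or any particular database's API:

    import numpy as np

    def top_k_cosine(query, corpus, k=3):
        """Exact nearest-neighbor search by cosine similarity; vector
        databases exist to approximate this at scales where a full
        scan over the corpus is too slow."""
        q = query / np.linalg.norm(query)
        c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
        sims = c @ q                     # cosine similarity to every row
        idx = np.argsort(-sims)[:k]      # indices of the k best matches
        return idx, sims[idx]

    rng = np.random.default_rng(42)
    corpus = rng.standard_normal((10_000, 128))  # 10k 128-dim embeddings
    query = rng.standard_normal(128)
    print(top_k_cosine(query, corpus))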
gk1 over 2 years ago
This is a rehash (pardon me) of this post from 2021: https://www.search.io/blog/vectors-versus-hashes

The demand for vector embedding models (like those released by OpenAI, Cohere, HuggingFace, etc.) and vector databases (like https://pinecone.io -- disclosure: I work there) has only grown since then. The market has decided that vectors are not, in fact, over.
nelsondev over 2 years ago
Seems the author is proposing LSH instead of vectors for doing ANN?

There are benchmarks here: http://ann-benchmarks.com/ , but LSH underperforms state-of-the-art ANN algorithms like HNSW on recall/throughput.

LSH, I believe, was state of the art 10-ish years ago, but has since been surpassed. Although the caching aspect is really nice.
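For readers who haven't met LSH: a minimal random-hyperplane (SimHash-style) sketch in NumPy — one common LSH family for cosine similarity, not necessarily the article's method:

    import numpy as np

    def simhash_signatures(vectors, n_bits=64, seed=0):
        """Random-hyperplane LSH: each bit records which side of a
        random hyperplane a vector falls on. Vectors with high cosine
        similarity tend to agree on more bits."""
        rng = np.random.default_rng(seed)
        planes = rng.standard_normal((n_bits, vectors.shape[1]))
        return (vectors @ planes.T) > 0   # boolean (n_vectors, n_bits)

    def hamming_distance(a, b):
        return int(np.count_nonzero(a != b))

    # Toy usage: nearby vectors get nearby signatures.
    x = np.array([1.0, 0.0, 0.5])
    y = np.array([0.9, 0.1, 0.4])    # similar to x
    z = np.array([-1.0, 2.0, -0.5])  # dissimilar
    sigs = simhash_signatures(np.vstack([x, y, z]))
    print(hamming_distance(sigs[0], sigs[1]))  # small
    print(hamming_distance(sigs[0], sigs[2]))  # larger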
robotresearcher over 2 years ago
A state vector can represent a point in the state space of floating-point representation, a point in the state space of a hash function, or any other discrete space.

Vectors didn't go anywhere. The article is discussing which function to use to interpret a vector.

Is there a special meaning of 'vector' here that I am missing? Is it so synonymous in the ML context with 'multidimensional floating-point state space descriptor' that any other use is not a vector any more?
whatever1 over 2 years ago
Omg, NN "research" is just heuristics on top of heuristics on top of mumbo jumbo.

Hopefully someone who knows math will enter the field one day, build the theoretical basis for all this mess, and allow us to make real progress.
PLenz over 2 years ago
Hashes are just short, constrained membership vectors
mrkeen over 2 years ago
> The analogy here would be the choice between a 1 second flight to somewhere random in the suburb of your choosing in any city in the world versus a 10 hour trip putting you at the exact house you wanted in the city of your choice.

Wouldn't the first part of the analogy actually be:

A 1 second flight that will probably land at your exact destination, but could potentially land you anywhere on earth?
olliej over 2 years ago
So my interpretation of the neural hash approach is largely that it is essentially trading a smaller number of floats for a much larger number of very small "neurons". Given that, I'd be curious what the total size difference is.

I could see the hash approach, at a functional level, resulting in different features essentially getting a different number of bits directly, which would be approximately equivalent to having a NN with variable-precision floats, all in a very hand-wavy way.

E.g. we could say a NN/NH needs N bits of information to work accurately, in which case you're trading the format and the operations on those N bits.
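A back-of-the-envelope comparison of the two bit budgets (the dimensions below are invented for illustration, not taken from the article):

    # Rough storage comparison: dense float embedding vs. binary hash.
    float_dims = 768               # e.g. a common transformer embedding size
    float_bits = float_dims * 32   # float32 components
    hash_bits = 4096               # a hypothetical binary hash for the same item

    print(f"float32 embedding: {float_bits} bits ({float_bits // 8} bytes)")
    print(f"binary hash:       {hash_bits} bits ({hash_bits // 8} bytes)")
    print(f"ratio: {float_bits / hash_bits:.1f}x")
    # 768 * 32 = 24576 bits vs 4096 bits -> the hash is 6x smaller here,
    # but each "feature" gets 1 bit of precision instead of 32.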
euphetar over 2 years ago
Very shallow article. Would like to see a list of the mentioned "recent breakthroughs" about using hashes in ML besides the retrieval applications, because this is genuinely interesting.
cratermoon over 2 years ago
And then there's this: https://news.ycombinator.com/item?id=33125640
whycombinetor over 2 years ago
The article's 0.65 vs 0.66 float64 example doesn't indicate much, since neither 0.65 nor 0.66 has a terminating representation in base 2...
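This is easy to verify with the Python standard library: Decimal(float) prints the exact binary value a literal is stored as.

    from decimal import Decimal

    # Both 0.65 and 0.66 are repeating fractions in base 2, so the
    # stored doubles are only the nearest representable approximations.
    print(Decimal(0.65))   # slightly off from 0.65
    print(Decimal(0.66))   # slightly off from 0.66
    print(0.66 - 0.65)     # not exactly 0.01, due to rounding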
eterevsky over 2 years ago
So... Isn't this just embeddings with 1 bit per value?

The natural question is: how are you going to train it?
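As a rough sketch of "1 bit per value": binarize a float embedding by sign and compare with Hamming similarity. (Training through the non-differentiable sign step is usually handled with tricks like the straight-through estimator, which this toy omits.)

    import numpy as np

    def binarize(embedding):
        """Collapse each float component to a single bit (its sign)."""
        return embedding > 0

    def hamming_similarity(a, b):
        """Fraction of bits that agree; plays the role cosine
        similarity plays for float embeddings."""
        return float(np.mean(a == b))

    emb1 = np.array([0.3, -1.2, 0.7, 0.01, -0.5])
    emb2 = np.array([0.4, -0.9, 0.6, -0.02, -0.4])  # close to emb1
    print(binarize(emb1))  # [ True False  True  True False]
    print(hamming_similarity(binarize(emb1), binarize(emb2)))  # 0.8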
tomrod over 2 years ago
Maybe I'm misunderstanding the guy, but he is effectively calling for lower-dimensional mappings from vectors to hashes. That is fine and all, but aren't hashes a single dimension in the way he is describing the use?
whimsicalism over 2 years ago
I work in this field and I found this article... very difficult to follow. A more technical description would be helpful so I can pattern-match to my existing knowledge.

Are they re-inventing autoencoders?
sramam over 2 years ago
(I know nothing about the area.)

Am I incorrect in thinking we are headed to future AIs that jump to conclusions? Or is it just my "human neural hash" being triggered in error?!
ummonk over 2 years ago
Such hashes *are* vectors over the Boolean field (with addition being bitwise XOR).
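Concretely, an n-bit hash is an element of GF(2)^n: adding two hashes is XOR, and the weight of the sum is their Hamming distance.

    # An 8-bit hash as a vector over GF(2).
    a = 0b10110100
    b = 0b10011100
    s = a ^ b                  # vector addition in GF(2)^8
    print(f"{s:08b}")          # 00101000
    print(bin(s).count("1"))   # Hamming distance between a and b: 2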
aaaaaaaaaaab over 2 years ago
Phew, I thought you wanted to ditch std::vector for hash maps!