科技回声

1 comment

ggm, more than 1 year ago

I don't think this is entirely it.

Another important quality of a hash function is that you can re-create it. It becomes testable that the input makes the hash. Therefore the hash can stand both as proof that the input is "the same" when seen another time, and as the short identity which itself distributes in a "semi-random" manner. It's not fully random, because given the input text anyone can derive the same hash. You're equating random with the distribution in the number field, but truly random things can't be repeated.

You focussed on the random quality, which goes to distribution of the hash as a key and the collision side of things. But the other side, being able to test the hash, implies access to the source, and an ability to run "the same" function to derive it.

Hash collisions are contextual. If the difficulty of constructing a collision is too low, then the hash becomes weaker. But it may not matter. An example here is that Google Photos hashes appear to be weak, because a small share (a sub-fractional %) of people report seeing other people's photos in their library. OK, that does matter, it's a breach of privacy. But at Google scale, it's noise (to them).

And most older hash-index models in C used to deal with hash collisions with a small serial walk to find the unique instance. The hash reduced lookup cost into a data structure but didn't actually guarantee uniqueness; it was contextual. Maybe more like sharding in modern terms?
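
To make the two points concrete, here is a rough sketch in Python (not C, just for brevity). The names `digest`, `verify` and `TinyTable` are made up for illustration, and the byte-sum hash is deliberately weak so that a collision is easy to trigger. It is one possible reading of "re-derive and compare" and of the "small serial walk", not a description of any particular library.

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 hex digest: the same input always yields the same output."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected: str) -> bool:
    """Re-derive the hash from the source and compare with the claimed one."""
    return digest(data) == expected


class TinyTable:
    """Open-addressing table: the hash only picks where to start looking."""

    def __init__(self, slots: int = 8):
        self.slots = [None] * slots  # each slot holds (key, value) or None

    def _start(self, key: str) -> int:
        # Hypothetical, deliberately weak hash (sum of bytes) so that
        # collisions are easy to demonstrate.
        return sum(key.encode()) % len(self.slots)

    def put(self, key: str, value) -> None:
        i = self._start(key)
        for _ in range(len(self.slots)):
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return
            i = (i + 1) % len(self.slots)  # serial walk to the next slot
        raise RuntimeError("table full")

    def get(self, key: str):
        i = self._start(key)
        for _ in range(len(self.slots)):
            entry = self.slots[i]
            if entry is None:
                break
            if entry[0] == key:  # the full key comparison gives uniqueness
                return entry[1]
            i = (i + 1) % len(self.slots)
        raise KeyError(key)


if __name__ == "__main__":
    claimed = digest(b"some input")
    print(verify(b"some input", claimed))   # same input, same hash
    print(verify(b"other input", claimed))  # different input

    table = TinyTable()
    table.put("ab", 1)
    table.put("ba", 2)  # same byte sum as "ab", so the same starting slot
    print(table.get("ab"), table.get("ba"))
```

The hash in `TinyTable` never decides identity by itself. It narrows the search to a starting slot, and the key comparison during the walk does the rest, which is why a collision there costs a few extra steps rather than a wrong answer.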

An Intuitive Explanation of Hashing

1 comment
