TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

CityHash: new hash function by Google (faster than Murmur)

86 点作者 markerdmann超过 13 年前

11 条评论

coderdude超过 13 年前
From the readme: "On a single core of a 2.67GHz Intel Xeon X5550, CityHashCrc256 peaks at about 6.09 bytes/cycle. The other CityHashCrc functions are wrappers around CityHashCrc256 and should have similar performance on long strings. CityHash128 peaks at about 4.31 bytes/cycle."<p>I think it's fascinating that they're able to determine the number of bytes their code processes during a single cycle. Does anyone know where to start looking to learn more about how to do that? It reminds me of a comment I think I read on here a month or two ago about the number of pixels that could be processed per cycle with an algorithm implemented on a 500MHz FPGA.<p>Edit: Thanks for the answers!
评论 #3521836 未加载
评论 #3521712 未加载
评论 #3522294 未加载
评论 #3521704 未加载
ot超过 13 年前
Relevant comment on reddit by the author of MurmurHash:<p><i>Hi, I'm the author of MurmurHash (let me know if verification is needed) and helped review the design of CityHash (I work at Google now). CityHash has higher throughput and is more complex, MurmurHash has lower latency for very short keys and is relatively simple. Both are more than adequate for any sort of hashing needs you might have and should be virtually identical to a random oracle for any keysets you might throw at them. I'm happy to answer questions if you have any.</i><p><a href="http://www.reddit.com/r/programming/comments/ozodk/cityhash_new_hash_function_by_google_faster_than/c3lg3ih" rel="nofollow">http://www.reddit.com/r/programming/comments/ozodk/cityhash_...</a>
gioele超过 13 年前
Is there an article that describes the algorithm in more abstract terms?<p>The C++ code is quite readable but I'd say that it contains too many magic numbers. Code is no substitute for a clear description given by the authors of the algorithm, its tradeoffs and design choices.
Groxx超过 13 年前
Whenever I see C code like this:<p><pre><code> #define CHUNK(multiplier, z) \ { \ uint64 old_a = a; \ a = Rotate(b, 41 ^ z) * multiplier + Fetch64(s); \ b = Rotate(c, 27 ^ z) * multiplier + Fetch64(s + 8); \ c = Rotate(d, 41 ^ z) * multiplier + Fetch64(s + 16); \ ... etc </code></pre> I wonder: isn't this the kind of thing that's better handled by creating a function, and <i>maybe</i> explicitly telling the compiler to inline it? I mean, you can do this (from here: <a href="http://gcc.gnu.org/onlinedocs/gcc/Inline.html" rel="nofollow">http://gcc.gnu.org/onlinedocs/gcc/Inline.html</a> ):<p><pre><code> inline void foo (const char) __attribute__((always_inline)); </code></pre> or you could simply let your optimizer take care of it. It may be faster to <i>not</i> inline it in some circumstances, but you've <i>gone out of your way</i> to prevent it from being able to make that decision, and to make your code harder to read and more prone to problems due to macro expansion.<p>Why? Is it habit? Or is there a good reason? (currently. historical reasons are not good reasons to continue doing things.)
评论 #3524196 未加载
casca超过 13 年前
Here's the announcement from Google in April last year: <a href="http://google-opensource.blogspot.com/2011/04/introducing-cityhash.html" rel="nofollow">http://google-opensource.blogspot.com/2011/04/introducing-ci...</a><p>The most important thing to take away from this is "These functions aren’t suitable for cryptography". The requirements for a cryptographically strong hash function are much stronger than one just used for hash-table performance. That's not to say that you shouldn't use it, rather be careful that there's no scope creep.
cmurphycode超过 13 年前
The page outright says that it is not a cryptographic hash, but the Hash Quality section doesn't expound on why (except for a minor flaw on short strings for one of the functions). Does anyone know what the expected collision rate for this hash is?<p>In other words, I don't care if the hash doesn't avalanche perfectly (half changed for one bit of input change; a requirement for cryptographic hashing so changes are "obvious") as long as the collision rate is ~SHA256. Or, if I <i>should</i> care, explanations would be wonderful.
评论 #3522141 未加载
评论 #3522265 未加载
DiabloD3超过 13 年前
Sure, its faster, but is it well distributed and has no known immediate flaws? I understand its not cryptographically secure, but does it perform as well as Murmurhash3 in those areas?
评论 #3521744 未加载
vilda超过 13 年前
Looking at the code it looks like it's heavily optimized for x86 family of processors.<p>What is suitable option for ARM processors? Similar Murmur hash depends on ultra-fast multiplication unit (which consumes a lot of energy).
nashby超过 13 年前
BTW, here is a ruby binding for this hash function <a href="https://github.com/nashby/cityhash" rel="nofollow">https://github.com/nashby/cityhash</a>
tripzilch超过 13 年前
Any reason why I should use this instead of the (very simple to implement, very good avalanche behaviour) FNV hash?
评论 #3523324 未加载
nerdo超过 13 年前
Is this really new? Initial check-in is July 22nd on the source tab.
评论 #3521930 未加载