TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Optimizing Rabin-Karp Hashing

74 点作者 mis1988大约 1 年前

5 条评论

stabbles大约 1 年前
Regarding &quot;A final optimization&quot;, there&#x27;s another way:<p>`_mm256_cmpeq_epi32` produces `0xFFFFFFFF`, the article suggests shifting to produce a `1` and then add.<p>Instead you can interpret `0xFFFFFFFF` as negative one, and subtract. That saves a shift.<p>Flip the sign when accumulating.<p>In general I think this is a pretty common counting trick. I don&#x27;t think those shift operations even exists for epi8, so there you really need to use it to avoid reduction to a narrow register. Also, in the case of epi8 you need to deal with overflow, so the pattern is like this in pseudo code:<p><pre><code> v[1:32] = 0 total = 0 for j = 0 to N &#x2F; 256 for i = 0 to 255 v[1:32] -= cmpeq(..., ...) end total += sum(v) end</code></pre>
评论 #39653296 未加载
pvg大约 1 年前
Short related discussion a month ago: <a href="https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=39254176">https:&#x2F;&#x2F;news.ycombinator.com&#x2F;item?id=39254176</a>
perihelions大约 1 年前
I wonder what a designed-for-SIMD rolling hash would look like. Like—just a trivial toy experiment—suppose hash ↤ (hash * input[j]) &lt;&lt; 1. That&#x27;s a (very crappy) rolling-hash function with a window size of sizeof(hash). It&#x27;s also algebraically equivalent to hash ↤ hash * (input[j] &lt;&lt; 1), which is a parallelizable horizontal-product over the input bytevector. This shows that you can keep the &quot;rolling window&quot; property, without the &quot;evaluates serially from left-to-right&quot; requirement: it&#x27;s possible to rearrange things algebraically so that the &quot;folding&quot; step can be evaluated in any order.<p>Are there nontrivial constructions with this property?
EdSchouten大约 1 年前
Are there actually practical cases where Rabin-Karp hashing is what dominates the running time of an application? The naive implementation already gives you 0.75 GB&#x2F;s. Seems pretty fast.
评论 #39650366 未加载
评论 #39650376 未加载
评论 #39650204 未加载
mis1988大约 1 年前
A response to a question posed by Daniel Lemire.