TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Hash tables

139 pointsby milsebgover 5 years ago

4 comments

dragontamerover 5 years ago
High-performance Hash table design is pretty interesting, because modern CPU architecture has grossly changed their performance characteristics.<p>In the 90s, it was (probably) faster to do linked-list &#x2F; chaining as a collision resolution mechanism. But because L1 caches are so fast on today&#x27;s machines (and DDR4 remains very high latency), linear probing seems to be the winner on modern machines.<p>Consider that DDR4 RAM is maybe 200-clock cycles to access (50ns on a 4GHz modern machine). L1 cache can be accessed within 4 clock cycles. Fetching an entire cache line is like guessing on 8-positions in linear probing.<p>The if&#x2F;else statements will execute all within single-digit nanoseconds (Modern CPUs execute at 4GHz, and can execute more than 1 instruction per clock tick... up to 6-instructions per tick if all instructions are in uop cache and are properly branch-predicted), long before the 2nd location (ex: the 2nd member in a Linked List) can be even accessed.<p>Furthermore, modern CPUs will perform pre-fetching when they notice code traversing through memory linearly. As such, the 200-clock cycle latency is &quot;pipelined&quot; and performed in parallel in practice. Your &quot;effective latency&quot; drops significantly when you traverse linearly.<p>EDIT: And finally: staying on the same DDR4 row (Roughly 1024 bytes) means that you only have to perform a Column-read command over and over again. If you leave a DDR4 Row, the RAM needs to precharge, open the new row, and then perform a new column read. Takes roughly 3x longer than reading from an already open column.
评论 #21219414 未加载
评论 #21219933 未加载
评论 #21224222 未加载
评论 #21219289 未加载
评论 #21220409 未加载
saagarjhaover 5 years ago
&gt; A hash function is a one-way function that produces the same output for a given input. It’s one-way in the sense that, given the hash function, it should be difficult to convert the output back to the input without trial and error.<p>Usually I like to say that they convert an input into a uniform distribution of fixed length output, since reversibility isn’t really important if you using it for a non-cryptographic purpose such as this one.
评论 #21218853 未加载
评论 #21219353 未加载
sdegutisover 5 years ago
Also <a href="http:&#x2F;&#x2F;craftinginterpreters.com&#x2F;hash-tables.html" rel="nofollow">http:&#x2F;&#x2F;craftinginterpreters.com&#x2F;hash-tables.html</a>
rurbanover 5 years ago
&gt; SipHash is a relatively fast hash function.<p>Oh my! SipHash is by far the slowest of all practical hash functions. <a href="https:&#x2F;&#x2F;github.com&#x2F;rurban&#x2F;smhasher#smhasher" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;rurban&#x2F;smhasher#smhasher</a>
评论 #21222260 未加载