TechEcho

4 comments

High-performance Hash table design is pretty interesting, because modern CPU architecture has grossly changed their performance characteristics.In the 90s, it was (probably) faster to do linked-list / chaining as a collision resolution mechanism. But because L1 caches are so fast on today's machines (and DDR4 remains very high latency), linear probing seems to be the winner on modern machines.Consider that DDR4 RAM is maybe 200-clock cycles to access (50ns on a 4GHz modern machine). L1 cache can be accessed within 4 clock cycles. Fetching an entire cache line is like guessing on 8-positions in linear probing.The if/else statements will execute all within single-digit nanoseconds (Modern CPUs execute at 4GHz, and can execute more than 1 instruction per clock tick... up to 6-instructions per tick if all instructions are in uop cache and are properly branch-predicted), long before the 2nd location (ex: the 2nd member in a Linked List) can be even accessed.Furthermore, modern CPUs will perform pre-fetching when they notice code traversing through memory linearly. As such, the 200-clock cycle latency is "pipelined" and performed in parallel in practice. Your "effective latency" drops significantly when you traverse linearly.EDIT: And finally: staying on the same DDR4 row (Roughly 1024 bytes) means that you only have to perform a Column-read command over and over again. If you leave a DDR4 Row, the RAM needs to precharge, open the new row, and then perform a new column read. Takes roughly 3x longer than reading from an already open column.

评论 #21219414 未加载

评论 #21219933 未加载

评论 #21224222 未加载

评论 #21219289 未加载

评论 #21220409 未加载

saagarjhaover 5 years ago

> A hash function is a one-way function that produces the same output for a given input. It’s one-way in the sense that, given the hash function, it should be difficult to convert the output back to the input without trial and error.Usually I like to say that they convert an input into a uniform distribution of fixed length output, since reversibility isn’t really important if you using it for a non-cryptographic purpose such as this one.

评论 #21218853 未加载

评论 #21219353 未加载

sdegutisover 5 years ago

Also <a href="http://craftinginterpreters.com/hash-tables.html" rel="nofollow">http://craftinginterpreters.com/hash-tables.html</a>

rurbanover 5 years ago

> SipHash is a relatively fast hash function.Oh my! SipHash is by far the slowest of all practical hash functions. <a href="https://github.com/rurban/smhasher#smhasher" rel="nofollow">https://github.com/rurban/smhasher#smhasher</a>

评论 #21222260 未加载

4 comments

dragontamerover 5 years ago

评论 #21219414 未加载

评论 #21219933 未加载

评论 #21224222 未加载

评论 #21219289 未加载

评论 #21220409 未加载

saagarjhaover 5 years ago

评论 #21218853 未加载

评论 #21219353 未加载

sdegutisover 5 years ago

Also <a href="http://craftinginterpreters.com/hash-tables.html" rel="nofollow">http://craftinginterpreters.com/hash-tables.html</a>

rurbanover 5 years ago

评论 #21222260 未加载

Hash tables

4 comments

Hash tables

4 comments