High-performance Hash table design is pretty interesting, because modern CPU architecture has grossly changed their performance characteristics.<p>In the 90s, it was (probably) faster to do linked-list / chaining as a collision resolution mechanism. But because L1 caches are so fast on today's machines (and DDR4 remains very high latency), linear probing seems to be the winner on modern machines.<p>Consider that DDR4 RAM is maybe 200-clock cycles to access (50ns on a 4GHz modern machine). L1 cache can be accessed within 4 clock cycles. Fetching an entire cache line is like guessing on 8-positions in linear probing.<p>The if/else statements will execute all within single-digit nanoseconds (Modern CPUs execute at 4GHz, and can execute more than 1 instruction per clock tick... up to 6-instructions per tick if all instructions are in uop cache and are properly branch-predicted), long before the 2nd location (ex: the 2nd member in a Linked List) can be even accessed.<p>Furthermore, modern CPUs will perform pre-fetching when they notice code traversing through memory linearly. As such, the 200-clock cycle latency is "pipelined" and performed in parallel in practice. Your "effective latency" drops significantly when you traverse linearly.<p>EDIT: And finally: staying on the same DDR4 row (Roughly 1024 bytes) means that you only have to perform a Column-read command over and over again. If you leave a DDR4 Row, the RAM needs to precharge, open the new row, and then perform a new column read. Takes roughly 3x longer than reading from an already open column.
> A hash function is a one-way function that produces the same output for a given input. It’s one-way in the sense that, given the hash function, it should be difficult to convert the output back to the input without trial and error.<p>Usually I like to say that they convert an input into a uniform distribution of fixed length output, since reversibility isn’t really important if you using it for a non-cryptographic purpose such as this one.
Also <a href="http://craftinginterpreters.com/hash-tables.html" rel="nofollow">http://craftinginterpreters.com/hash-tables.html</a>
> SipHash is a relatively fast hash function.<p>Oh my!
SipHash is by far the slowest of all practical hash functions.
<a href="https://github.com/rurban/smhasher#smhasher" rel="nofollow">https://github.com/rurban/smhasher#smhasher</a>