I really like Raymond Hettinger's video on Python dictionaries: <a href="https://www.youtube.com/watch?v=npw4s1QTmPg">https://www.youtube.com/watch?v=npw4s1QTmPg</a>
This is deeply wrong.<p>> Python then calculates the hash value for each key in the dictionary using the MurmurHash3 hash function.<p>Umm... this isn't right. At all. Depending on build configuration, Python hashes str and bytes with an externally provided hash, a modified Fowler-Noll-Vo (FNV) hash, or SipHash (the default since PEP 456 landed in Python 3.4).<p>Here's a quote from Include/pyhash.h:<p>* The values for Py_HASH_* are hard-coded in the
* configure script.
*
* - FNV and SIPHASH* are available on all platforms and architectures.
* - With EXTERNAL embedders can provide an alternative implementation with::<p>and the implementations are in Python/pyhash.c.<p>The only "murmur" in the repo is in Tools/peg_generator/data/top-pypi-packages-365-days.json, as a mention of a PyPI package.<p>And the linked-to Wikipedia page at <a href="https://en.wikipedia.org/wiki/MurmurHash" rel="nofollow">https://en.wikipedia.org/wiki/MurmurHash</a> says: "The authors of the attack recommend to use their own SipHash instead" to avoid collision attacks.<p>(Also, the example hash is 5478795832145536229, which needs 64 bits, while the Wikipedia page says MurmurHash3 generates a 32-bit or 128-bit hash value.)<p>> When a hash collision occurs, Python stores multiple key-value pairs in the same bucket and uses a linked list to store them.<p>Umm ... that isn't right either.<p>Keeping colliding entries in a per-bucket linked list is "open hashing"/"separate chaining", but Python uses "closed hashing"/"open addressing", where every entry lives in the table itself and a collision means probing for another slot. As the linked-to Python source explains, "Open addressing is preferred over chaining since the link overhead for
chaining would be substantial (100% with typical malloc overhead)."<p>skilled's comment about this being ChatGPT-generated is flagged and dead, but ChatGPT does seem good at being both wrong and confident. Just like this essay.
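<p>Both points are easy to check or illustrate from a Python prompt. Here is a minimal sketch: hash_algorithm() just reads sys.hash_info.algorithm, which reports the str/bytes hash the interpreter was built with, and ToyOpenAddressingTable is a hypothetical toy of my own that shows what "open addressing" means. It uses plain linear probing with no resizing or deletion, so it is not CPython's actual probe sequence or dict layout, only the general idea that a colliding entry moves to another slot in the same array instead of onto a linked list.

```python
import sys


def hash_algorithm() -> str:
    """Return the name of the str/bytes hash this interpreter was built with."""
    return sys.hash_info.algorithm


class ToyOpenAddressingTable:
    """Toy fixed-size hash table using linear probing.

    Illustration only: no resizing, no deletion, and NOT the probe
    sequence that CPython's dict uses.
    """

    def __init__(self, size: int = 8) -> None:
        # Each slot holds one (key, value) pair or None. There are no chains.
        self._slots = [None] * size

    def _probe(self, key):
        # Visit every slot once, starting at the key's home slot.
        size = len(self._slots)
        start = hash(key) % size
        for step in range(size):
            yield (start + step) % size

    def put(self, key, value) -> int:
        """Store the pair and return the slot index it ended up in."""
        for i in self._probe(key):
            slot = self._slots[i]
            if slot is None or slot[0] == key:
                self._slots[i] = (key, value)
                return i
        raise RuntimeError("table is full")

    def get(self, key):
        for i in self._probe(key):
            slot = self._slots[i]
            if slot is None:
                break  # an empty slot ends the probe: the key is absent
            if slot[0] == key:
                return slot[1]
        raise KeyError(key)
```

<p>With an 8-slot table, the keys 1, 9 and 17 all have home slot 1 (small ints hash to themselves), so put() places them in slots 1, 2 and 3 of the one array, and get(9) still finds its value by following the same probe order. On a stock CPython build, hash_algorithm() should report one of the SipHash variants rather than anything with "murmur" in the name.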