If you have a 114-page paper, you probably don't have a document of what <i>every</i> programmer should know about memory, especially since many programmers work in domains where some of the recommendations here aren't even possible to follow!<p>Here's a brief summary of what <i>every</i> programmer really needs to know about memory:<p>* There is a hierarchy of memory from fast-but-small to slow-but-large. The smallest and fastest is L1 cache, which is typically on the order of 10-100 KiB per core, and isn't growing substantially over time. The largest cache is L3 cache, in the 10s of MiB, and shared between many or all cores, while your RAM is in the 10s of GiB. An idea of the difference in access times can be found here: <a href="https://gist.github.com/jboner/2841832" rel="nofollow">https://gist.github.com/jboner/2841832</a>.<p>* All memory traffic happens in cacheline-sized blocks of memory, 64 bytes on x86. (It might be different on other processor architectures, but not by much.) If you're reading 1 byte of data from memory, you're going to also read in 63 other bytes anyways, whether or not you're actually using that data.<p>* The slow speed of DRAM (compared to processor clock speed) means that it's frequently worth trading off a little CPU computation time for tighter memory utilization. For example, storing a 16-bit integer instead of a pointer-sized integer, if you know that the maximum value will be less than 65,536.<p>* "Pointer chasing" algorithms are generally less efficient than array-based algorithms. In particular, an array-based list is going to be faster than a linked list most of the time, especially on small lists (fewer than 1,000 elements).<p>That about covers what I think <i>every</i> programmer needs to know about memory. 
There are some topics that <i>many</i> programmers should perhaps know. In descending order of importance, I think, those are cross-thread contention for memory, false sharing, and then NUMA. But by the time you're in the weeds of "let's talk about nontemporal stores"... yeah, that's definitely not in the ballpark of things everyone should know about, especially if you're not going to mention when nontemporal stores will hurt instead of help [1]. Also, transactional memory is something I'd put on my list of perennial topics of "this sounds like a good idea in theory, but doesn't work in practice."<p>[1] What Drepper omits is that a nontemporal store will evict the cache line it writes to if that line is already in cache, so you <i>really</i> shouldn't use them unless you <i>know</i> there is going to be no reuse.
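<p>To put rough numbers on the cacheline and 16-bit-versus-pointer-sized points from the summary, here's a minimal sketch of the arithmetic. It assumes the 64-byte x86 cacheline mentioned above and an 8-byte pointer-sized integer; the helper names are made up for illustration, and it only counts bytes rather than measuring anything on real hardware.

```python
import struct

# Assumption: 64-byte cache lines, as on x86 (other architectures may differ).
CACHE_LINE_BYTES = 64

# "<" selects struct's standard sizes: H is a 2-byte unsigned integer,
# Q is an 8-byte unsigned integer (standing in for a pointer-sized value
# on a 64-bit machine).
U16_BYTES = struct.calcsize("<H")
U64_BYTES = struct.calcsize("<Q")


def elements_per_cache_line(element_bytes: int) -> int:
    """How many contiguous elements of this size fit in one cache line."""
    return CACHE_LINE_BYTES // element_bytes


def cache_lines_touched(element_bytes: int, count: int) -> int:
    """Minimum cache lines needed to hold `count` contiguous elements,
    assuming the array starts on a cache-line boundary."""
    total_bytes = element_bytes * count
    return -(-total_bytes // CACHE_LINE_BYTES)  # ceiling division


if __name__ == "__main__":
    for name, size in (("16-bit", U16_BYTES), ("64-bit", U64_BYTES)):
        print(
            f"{name}: {elements_per_cache_line(size)} per cache line, "
            f"{cache_lines_touched(size, 1000)} lines for 1,000 elements"
        )
```

Under those assumptions, scanning 1,000 values stored as 16-bit integers pulls in a quarter as many cachelines as the same values stored as 64-bit integers, which is the whole argument for tighter memory utilization.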