For those interested in RAM timings, the effects of caches, and other low-level memory topics explained exceptionally well, check out "What Every Programmer Should Know About Memory" [1].

[1] http://people.redhat.com/drepper/cpumemory.pdf
Has anyone run similar tests on the JVM after warmup?

Theoretically the JVM should be moving objects around in memory (e.g. when the GC compacts the heap) to negate as much of the cache penalty as possible, and I've definitely noticed this in some algorithms: it's how people end up with Java code that outperforms almost identical code in pure C, and I've seen 3:1 speed differences in favor of the JVM. But I'd be curious to see the graphs.

Maybe I'll do it later today if I get a few minutes...
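In case anyone wants a head start, something like this is what I have in mind. It's just a rough sketch I typed out here (the array size, strides, and warmup count are arbitrary choices of mine, and a serious run would use JMH), but walking a big array at increasing strides after a warmup pass should make the cache-line and cache-size effects show up even on the JVM:

    // StrideBench.java -- crude stride benchmark, run after JIT warmup.
    public class StrideBench {
        // 16M ints = 64 MB, comfortably larger than any L3 cache.
        static final int[] data = new int[16 * 1024 * 1024];

        // Touch every stride-th element; return a value so the JIT
        // can't throw the loop away as dead code.
        static long walk(int stride) {
            long sum = 0;
            for (int i = 0; i < data.length; i += stride) {
                sum += data[i]++;
            }
            return sum;
        }

        public static void main(String[] args) {
            // Warmup: let the JIT compile walk() before timing anything.
            for (int i = 0; i < 20; i++) walk(1);

            for (int stride = 1; stride <= 1024; stride *= 2) {
                long start = System.nanoTime();
                long sink = walk(stride);
                long ns = System.nanoTime() - start;
                long touched = (data.length + stride - 1) / stride;
                // Report cost per touched element so strides are comparable.
                System.out.printf("stride %4d: %6.2f ns/element (sink=%d)%n",
                                  stride, (double) ns / touched, sink);
            }
        }
    }

This only reproduces the raw stride graphs on a flat array, though; measuring the object-layout/GC effect properly would need a pointer-chasing version, which I haven't written.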
Nice summary. His "example 6", about false cache sharing, bit me once when I was allocating data structures for my threads and they ended up close enough in memory that the threads shared cache lines. It took me a while to figure it out, and probably the only reason I did was that it didn't happen every time, so I kept wondering why the code sometimes ran slower. If it had been consistent I would just have assumed the code was slow. There are profilers that spit out cache hit rates, but they usually report at the scale of whole threads, which makes it very difficult to find which code is at fault. You also have to have enough experience to know what a "normal" hit rate is, which I'm not sure I have.
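For anyone who hasn't run into it, this is roughly the shape of the bug. It's a toy sketch I wrote for this comment (not my original code; it assumes 64-byte cache lines and uses AtomicLongArray so the JIT can't hoist the increments out of the loops):

    import java.util.concurrent.atomic.AtomicLongArray;

    // FalseSharing.java -- two threads each hammer their own counter.
    // When the counters are adjacent array slots they share a cache line
    // and the line ping-pongs between cores; 16 slots (128 bytes) apart
    // they don't, and the same work usually finishes much faster.
    public class FalseSharing {
        static final int ITERS = 50_000_000;

        static long timedRunMillis(int indexA, int indexB) throws InterruptedException {
            final AtomicLongArray counters = new AtomicLongArray(64);
            Thread t1 = new Thread(() -> {
                for (int i = 0; i < ITERS; i++) counters.incrementAndGet(indexA);
            });
            Thread t2 = new Thread(() -> {
                for (int i = 0; i < ITERS; i++) counters.incrementAndGet(indexB);
            });
            long start = System.nanoTime();
            t1.start(); t2.start();
            t1.join(); t2.join();
            return (System.nanoTime() - start) / 1_000_000;
        }

        public static void main(String[] args) throws InterruptedException {
            System.out.println("adjacent counters: " + timedRunMillis(0, 1) + " ms");
            System.out.println("padded counters:   " + timedRunMillis(0, 16) + " ms");
        }
    }

The nasty part in real code is that you never write the "adjacent" version on purpose; the allocator just happens to put two hot structures next to each other, which is why it only bit me some of the time.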
The next level after caching is paging and virtual memory. Here's a previous submission explaining that in a humorous fashion:

http://news.ycombinator.com/item?id=1032528