One behavior I've noticed with Linux is that if you read files sequentially from disk (for example, with scp), Linux fills all of memory with those files' contents and then swaps out everything except the (obviously useless) disk cache.
So you end up with memory full of data you will never need again, and trying to do anything causes a large and painful swap-in (for me it had the side effect of halting my qemu).<p>This is true insanity. Sure, you can disable swap or tune swappiness, but what's the reason for such a crazy default behavior?
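<p>For anyone who wants to see what their box is currently set to before tuning anything, here is a minimal sketch that reads the swappiness value. It assumes the usual procfs location /proc/sys/vm/swappiness; the function name read_swappiness is just made up for illustration.

```python
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return vm.swappiness as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Not Linux, procfs not mounted, or unexpected file contents.
        return None


if __name__ == "__main__":
    value = read_swappiness()
    if value is None:
        print("vm.swappiness is not readable on this system")
    else:
        print(f"vm.swappiness = {value}")
```

This only reads the value. Changing it means writing to the same file as root (or using sysctl), which the sketch deliberately leaves out.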
I still don't comprehend why one needs swap at all. All the explanations I have come across talk about not having enough memory. Given that one has at least 8 GB of memory, or maybe even >100GB, why on earth would you need swap? Sure some process might allocate even more than that, but maybe it's better to refuse such a request than to slow down the whole system due to thrashing.<p>I get the idea that the reason might be that a lot of programs allocate memory which they don't actually need regularly, which is then very convenient to swap out. Rather than enabling this bad habit using slow disk storage it would be much better to expect programs to be more frugal, or at least signify whether something should be kept in memory or not.
I am very impressed with how well written this article is. A brief description of the problem, links to relevant discussions so less informed readers can come up to speed, and clear examples of how key pieces of information were gathered. I learned more about the topic at hand from this article than I have from any blog post in recent memory.
Another good article is <a href="https://kevinclosson.wordpress.com/2009/05/14/you-buy-a-numa-system-oracle-says-disable-numa-what-gives-part-ii/" rel="nofollow">https://kevinclosson.wordpress.com/2009/05/14/you-buy-a-numa...</a><p>On commodity servers, unless you have specific reasons to do otherwise, just switch from NUMA to SUMA. There are two things you should do:<p>* Change a BIOS setting. The term for this will vary by manufacturer. For Dell, it means enabling node interleaving.<p>* Pass numa=off to the Linux kernel (e.g. edit grub.cfg)
Yes, NUMA effects will really kill you, though how much depends on the particular quad-proc topology. For anyone interested, I have some measurements in a small workshop paper I put together (I was gathering the numbers anyway while tuning our garbage collector):<p><a href="http://dl.dropbox.com/u/1620890/website/writings/mspc12-stream.pdf" rel="nofollow">http://dl.dropbox.com/u/1620890/website/writings/mspc12-stre...</a>
Will this optimization help on virtualized machines like Xen? Or does all memory appear to be the same?<p>On an EC2 m1.xlarge:<p>$ numactl --hardware
available: 1 nodes (0)
libnuma: Warning: /sys not mounted or invalid. Assuming one node: No such file or directory
node 0 cpus:
node 0 size: <not available>
node 0 free: <not available>
libnuma: Warning: Cannot parse distance information in sysfs: No such file or directory
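<p>When numactl cannot parse sysfs like this, one rough way to see what the guest kernel exposes is to look for node directories yourself. A minimal sketch, assuming the standard Linux layout /sys/devices/system/node/nodeN; the helper name list_numa_nodes is hypothetical.

```python
import os
import re


def list_numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return the sorted NUMA node ids visible under sysfs, or [] if none."""
    try:
        entries = os.listdir(sysfs_root)
    except OSError:
        # Directory missing: sysfs not mounted or kernel built without NUMA.
        return []
    node_ids = []
    for name in entries:
        match = re.fullmatch(r"node(\d+)", name)
        if match:
            node_ids.append(int(match.group(1)))
    return sorted(node_ids)


if __name__ == "__main__":
    nodes = list_numa_nodes()
    if nodes:
        print(f"{len(nodes)} NUMA node(s) visible: {nodes}")
    else:
        print("no NUMA node information exposed by this kernel")
```

An empty list or a single node only tells you what the guest sees. It says nothing about how the hypervisor places the memory on the physical host.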
For those of us mere mortals, would it be safe to assume that adding the suggested line to mysqld_safe would be ok to do?<p>cmd="/usr/bin/numactl --interleave all $cmd"
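<p>To spell out what that line does: it prefixes the server command with numactl so memory allocations are interleaved across all nodes. Here is the same idea as a minimal sketch with a fallback for machines where numactl is not installed. The function name with_interleave is made up for illustration, and the /usr/bin/numactl default is simply the path from the quoted line.

```python
import os


def with_interleave(cmd, numactl="/usr/bin/numactl"):
    """Prefix cmd (a list of arguments) with numactl interleaving if available."""
    if os.access(numactl, os.X_OK):
        return [numactl, "--interleave", "all"] + list(cmd)
    # numactl not installed or not executable: run the command unchanged.
    return list(cmd)


if __name__ == "__main__":
    # "mysqld" here is only a placeholder command for the demonstration.
    print(with_interleave(["mysqld", "--user=mysql"]))
```

The fallback means the wrapper does nothing on a box without numactl, rather than breaking startup.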
The title is inaccurate. It should say "the Linux swap insanity problem", because this is entirely related to the Linux kernel. It just happens to affect MySQL and similar workloads, but it is not MySQL's fault. It doesn't behave that way on other platforms either.
tl;dr - If you're running a database, or a generally memory-intensive system, on a machine with multiple CPUs, you should run this command: echo 0 > /proc/sys/vm/zone_reclaim_mode<p>But the article is great. You should definitely read it.
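<p>Before running that echo, it may be worth checking whether zone reclaim is even turned on. A minimal sketch, assuming the procfs path from the command above; the function name zone_reclaim_enabled is hypothetical. The value is a bitmask, and 0 means zone reclaim is off.

```python
from pathlib import Path


def zone_reclaim_enabled(path="/proc/sys/vm/zone_reclaim_mode"):
    """Return True/False for the current setting, or None if it is unreadable."""
    try:
        mode = int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # The file may be absent, e.g. on a kernel without NUMA support.
        return None
    # Any nonzero bitmask value means some form of zone reclaim is active.
    return mode != 0


if __name__ == "__main__":
    state = zone_reclaim_enabled()
    if state is None:
        print("zone_reclaim_mode is not available on this system")
    elif state:
        print("zone reclaim is ON; writing 0 to the file (as root) disables it")
    else:
        print("zone reclaim is already off")
```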