<i>I think we can safely say that we never "flush" the CPU cache within our programs.</i><p>Perhaps true, but not for lack of trying!<p>To benchmark compression algorithms from a cold cache, I've been trying to flush the CPU caches intentionally using WBINVD (Write Back and Invalidate Cache) and CLFLUSH (Cache Line Flush). I'm finding this difficult to do, at least under Linux on an Intel Core i7.<p>1) WBINVD can only be executed from Ring 0, i.e. in the kernel. The only way I've found to call this instruction from user space is with a custom kernel module and an ioctl(). This works, but feels overly complicated. Is there some built-in way to do this?<p>2) CLFLUSH is straightforward to call, but I'm not sure it's working for me. I stride through the area I want uncached at 64-byte intervals calling _mm_clflush(), but I'm not getting consistent results. Is there more that I need to do? Do I need MFENCE both before and after, or in the loop?
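<p>For reference, here is a minimal sketch of the CLFLUSH loop described in 2), written the conservative way: one MFENCE before the loop and one after, none inside it. The helper name flush_range and the CACHE_LINE constant are my own, and the 64-byte line size is an assumption that matches the Core i7 but should be checked on other parts. It uses only the _mm_clflush() and _mm_mfence() intrinsics from emmintrin.h, so it assumes an x86 target with SSE2. I'm not claiming this is the definitive answer to the fencing question, only a pattern that is commonly used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence (SSE2) */

/* Assumption: 64-byte cache lines, as on the Core i7 mentioned above. */
#define CACHE_LINE 64u

/* Hypothetical helper: evict every cache line overlapping [p, p + len).
   Returns the number of lines flushed. */
static size_t flush_range(const void *p, size_t len)
{
    assert(p != NULL || len == 0);
    if (len == 0)
        return 0;

    /* Round the start down to a line boundary so that a misaligned
       buffer does not leave its last partial line untouched. */
    uintptr_t addr = (uintptr_t)p & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end  = (uintptr_t)p + len;
    size_t lines = 0;

    _mm_mfence();  /* let earlier loads and stores finish first */
    for (; addr < end; addr += CACHE_LINE) {
        _mm_clflush((const void *)addr);
        lines++;
    }
    _mm_mfence();  /* let the flushes finish before timing starts */
    return lines;
}
```

<p>The idea would be to call flush_range(buf, n) on the input buffer immediately before starting the timer. Rounding the start address down matters if the buffer is not 64-byte aligned, since striding from an unaligned pointer can skip the final line. Even with this in place, I'd expect some run-to-run noise: hardware prefetchers may pull lines back in, and CLFLUSH does nothing about the TLB or the instruction cache.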