I love this. Many a moon ago, I worked on a system called Aikido at MIT, which combined a special built hypervisor with a binary rewriting system (DynamoRio) to enable efficient time travel debugging and race detection of parallel applications.<p>If anyone's interested, here's a publication that talks about it in more detail:<p><a href="https://dspace.mit.edu/handle/1721.1/72082" rel="nofollow">https://dspace.mit.edu/handle/1721.1/72082</a><p>The use of performance counters here also reminds me of another project I worked on called Kendo, which was a posix thread like replacement that used performance counters to enforce a deterministic interleaving of synchronization operations (mutexes, etc). The system could guarantee determinism for programs that didn't have race conditions. Back then, I found that counting instructions wasn't deterministic on the processors of the time, but counting store operations was. If anyone's interested in that work, here's the publication:<p><a href="http://www.cag.csail.mit.edu/~mareko/asplos073-olszewski.pdf" rel="nofollow">http://www.cag.csail.mit.edu/~mareko/asplos073-olszewski.pdf</a>