From the linked GitHub code:<p><pre><code> /*
Intel actually recommends calling CPUID to serialize the execution flow
and reduce variance in measurement due to out-of-order execution.
We don't do that here yet.
see §3.2.1 http://www.intel.com/content/www/us/en/embedded/training/ia-32-ia-64-benchmark-code-execution-paper.html
*/
static int64_t cpucycles(void) {
  unsigned int hi, lo;
  __asm__ volatile("rdtsc\n\t" : "=a"(lo), "=d"(hi));
  return ((int64_t)lo) | (((int64_t)hi) << 32);
}
</code></pre>
Here `cpucycles` returns the value of the time stamp counter, i.e. the number of cycles since the CPU was last reset, as a 64-bit integer. [1]<p>Seems really useful for benchmarking!<p>Note that this code was last updated in Oct 2020, and the comment at the top says it does not yet serialize execution with CPUID. Also, from reading about this instruction, it sounds like it may not work reliably on hyperthreaded CPUs? Does anyone know a more accurate (?) version of this code, possibly one that follows Intel's suggestion (or not)?<p>The author of the article claims:<p>> (The code calls the RDTSC instruction to get accurate cycle-level timing measurements.)<p>This implies that the above `cpucycles` is more accurate than using high-resolution clocks. Is that still true on multicore or hyperthreaded CPUs? If not, does this mean dudect is less accurate on such systems?<p>Really curious about this. If anyone can point me to more code, I'll be very happy!<p>[1] <a href="https://en.wikipedia.org/wiki/Time_Stamp_Counter" rel="nofollow">https://en.wikipedia.org/wiki/Time_Stamp_Counter</a>
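For what it's worth, here is a minimal sketch of what following the Intel suggestion quoted in that comment might look like: CPUID then RDTSC before the measured region, and RDTSCP then CPUID after it. It assumes x86-64, GCC or Clang inline assembly, and a CPU that supports RDTSCP. The names `serialize_cpuid`, `cpucycles_begin`, `cpucycles_end`, `measure_cycles` and `busy_work` are made up for illustration and are not part of dudect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: x86-64 only, GCC/Clang inline-asm syntax. */

/* CPUID is a serializing instruction: earlier instructions must
   complete before anything after it starts executing. */
static inline void serialize_cpuid(void) {
    uint32_t eax = 0, ebx, ecx = 0, edx;
    __asm__ volatile("cpuid"
                     : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                     :
                     : "memory");
    (void)ebx;
    (void)edx;
}

/* Start of a measurement: serialize first, then read the counter. */
static inline int64_t cpucycles_begin(void) {
    uint32_t hi, lo;
    serialize_cpuid();
    __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi) : : "memory");
    return (int64_t)(((uint64_t)hi << 32) | lo);
}

/* End of a measurement: RDTSCP waits for earlier instructions to
   execute before reading the counter; the CPUID afterwards keeps
   later instructions from starting before that read. */
static inline int64_t cpucycles_end(void) {
    uint32_t hi, lo, aux;
    __asm__ volatile("rdtscp" : "=a"(lo), "=d"(hi), "=c"(aux) : : "memory");
    serialize_cpuid();
    (void)aux;
    return (int64_t)(((uint64_t)hi << 32) | lo);
}

/* Hypothetical helper: counter ticks spent inside fn(arg). */
static int64_t measure_cycles(void (*fn)(void *), void *arg) {
    assert(fn != NULL);
    int64_t start = cpucycles_begin();
    fn(arg);
    int64_t end = cpucycles_end();
    return end - start;
}

/* Stand-in workload, only here so the sketch can be exercised. */
static void busy_work(void *arg) {
    volatile uint32_t sink = 0;
    for (uint32_t i = 0; i < 1000; i++) {
        sink += i;
    }
    (void)arg;
}
```

Calling `measure_cycles(busy_work, NULL)` repeatedly and looking at the minimum or the distribution is the usual way to use something like this. One caveat: on CPUs with an invariant TSC the counter ticks at a constant rate, so the result is in reference ticks and not necessarily actual core clock cycles.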
I wonder if there isn't just a simple way to make timing attacks impossible. Find out the longest possible runtime, add some overhead, and then do the encryption in a different thread or process. Then call back after that constant time, no matter when the work actually finished. It seems like this should be a primitive that is used a lot.
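A rough sketch of the "pad to a fixed deadline" idea, under some loud assumptions: it runs the operation on the calling thread instead of a separate thread or process (to keep it short), it relies on `clock_nanosleep` with `TIMER_ABSTIME` (available on Linux, not everywhere), and `run_padded` and `example_op` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper: run op(arg), then sleep until start + budget_ns,
   so the caller always gets control back at (roughly) the same time.
   Returns 0 if op finished within the budget, -1 if it overran. */
static int run_padded(void (*op)(void *), void *arg, int64_t budget_ns) {
    struct timespec deadline, now;
    assert(op != NULL && budget_ns >= 0);

    /* Fix the absolute deadline before doing any secret-dependent work. */
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec += (time_t)(budget_ns / NSEC_PER_SEC);
    deadline.tv_nsec += (long)(budget_ns % NSEC_PER_SEC);
    if (deadline.tv_nsec >= NSEC_PER_SEC) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= NSEC_PER_SEC;
    }

    op(arg);

    /* Did the operation take longer than the assumed worst case? */
    clock_gettime(CLOCK_MONOTONIC, &now);
    int overran = now.tv_sec > deadline.tv_sec ||
                  (now.tv_sec == deadline.tv_sec &&
                   now.tv_nsec > deadline.tv_nsec);

    /* Sleep until the absolute deadline; retry if a signal interrupts.
       If the deadline has already passed this returns immediately. */
    while (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                           &deadline, NULL) == EINTR) {
    }
    return overran ? -1 : 0;
}

/* Stand-in for the real encryption call. */
static void example_op(void *arg) {
    (void)arg;
}
```

Something like `run_padded(example_op, NULL, 20000000)` would return about 20 ms after it was called, however quickly `example_op` finished. The `-1` return is the weak spot: if the budget turns out to be too small, the overrun itself is visible to the caller.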