You have to be on the same machine as the software doing the encryption. It works in roughly the same way as the (independently discovered) attacks by Daniel Bernstein, Eran Tromer, Onur Aciiçmez, and Colin Percival: you run a tight loop measuring memory access times to detect cache perturbation, and postprocess the measurements to guess key bits. It has nothing to do with AES-the-algorithm.<p>The paper is interesting (they used an artificial neural network to filter the measurements), but the results aren't ultra-surprising; I think everyone expects more side channels to be discovered on x86 hardware in the coming years, especially since much of the microarchitecture is undocumented. (Note: this paper targets a very old processor, which probably improved their results.)<p>The biggest issue with this branch of side-channel research --- an issue not mentioned in the paper --- is that "getting on the same machine as an encryption process" is much less hard than it sounds in 2010: many encryption processes run on VMs inside hosting providers, where the cost of situating an attacker on the same metal as the target might be as low as $20.<p>But, long story short: this isn't the crypto attack you should be most worried about. My understanding is that OpenSSL has pushed back on fixing much more straightforward timing channels than this. There are <i>remote</i> attacks that are still worth attention.
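To make the measurement side concrete, here is a rough C sketch of the kind of probe loop such attacks build on. Everything here is illustrative (the buffer layout, the line count, the use of rdtscp); the real attack's eviction strategy, scheduling against the victim, and post-processing are far more involved:<p><pre><code>#include &lt;stdint.h&gt;
#include &lt;x86intrin.h&gt;              /* __rdtscp */

#define LINES      64               /* number of cache lines we monitor (illustrative) */
#define LINE_SIZE  64               /* typical x86 cache-line size in bytes */

static uint8_t probe_buffer[LINES * LINE_SIZE];

/* Time a single read; a "slow" read suggests the victim evicted our line,
   i.e. it touched data mapping to the same cache set. */
static inline uint32_t time_read(volatile uint8_t *p)
{
    unsigned aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*p;
    uint64_t t1 = __rdtscp(&aux);
    return (uint32_t)(t1 - t0);
}

/* One probe round: record how long each monitored line takes to read. */
void probe_round(uint32_t timings[LINES])
{
    for (int i = 0; i &lt; LINES; i++)
        timings[i] = time_read(&probe_buffer[i * LINE_SIZE]);
    /* The attack repeats this in a tight loop interleaved with the victim's
       AES rounds, and feeds the resulting timing matrix to a classifier
       (the paper uses an artificial neural network) to recover key bits. */
}
</code></pre>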
from the conclusion it looks like the latest version of OpenSSL mitigates the attack:<p>'''<p>One concrete mitigation strategy has been realized in
OpenSSL 1.0 [18]. There, only the substitution table S is
stored, and then the required multiplications within GF(2^8)
are computed by using the following relations, which can
be realized in a highly efficient way using the PCMPGTB
instruction:<p><pre><code>2•x = (x << 1) ⊕ (1b ∧ [ff if (int8_t) x > 0; 00 if (int8_t) x ≤ 0])
    = (x << 1) ⊕ (1b ∧ PCMPGTB(x, 0))
3•x = 2•x ⊕ x
</code></pre>
In this case, the required table contains 2^8 = 256 entries of
2^0 = 1 byte each, and on standard x86 architectures with
a cache-line size of 2^6 = 64 bytes we have that only l = 2
bits of each x∗ are leaked. Looking at Table 1 now shows
that we have p_3 = 1, i.e., every k∗ ∈ {0, 1}^(4·2) is a valid
partial key-column candidate for every x∗ and y∗. For this
reason, our key search algorithm does not work anymore.
Because of the large prevalence of AES another mitigation
strategy is currently embarked by software vendors. Namely,
they are increasingly often offering hardware support of
AES in their chips, e.g., [25], rendering access-driven cache
attacks impossible.<p>'''
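For concreteness, the quoted relation can be written branchlessly in plain C. This is only my reading of the formula, not OpenSSL's actual code, and I'm assuming the mask is meant to flag bytes whose top bit is set (the usual condition for the AES reduction by 0x1b); PCMPGTB just produces the same full-byte mask for 16 bytes at once:<p><pre><code>#include &lt;stdint.h&gt;

/* Branchless doubling in GF(2^8): shift left and conditionally XOR the AES
   reduction polynomial 0x1b. The condition becomes a full-byte mask, so
   there is no secret-dependent branch or table lookup. */
static inline uint8_t gf_double(uint8_t x)
{
    uint8_t mask = (uint8_t)-(x >> 7);   /* 0xff if the top bit of x is set, else 0x00 */
    return (uint8_t)((x << 1) ^ (0x1b & mask));
}

/* 3*x = 2*x XOR x, as in the quoted conclusion. */
static inline uint8_t gf_triple(uint8_t x)
{
    return (uint8_t)(gf_double(x) ^ x);
}
</code></pre><p>The leakage arithmetic in the quote also checks out: a single 256-entry, 1-byte-per-entry table spans 256/64 = 4 cache lines, so each 64-byte line covers 2^6 = 64 entries and only the top 8 - 6 = 2 bits of a table index are visible through which line gets touched, which is why the paper's key search no longer works against it.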
Can anybody explain to us non-cryptographers:<p>1) Is this legit?<p>2) Can this work in the real world, or only under very specific lab conditions?
I know David and have seen him demo the software in practice; it is almost instant and very surprising. He has been playing with caches for a long time now.