This article focuses pretty heavily on the possibility of cache timing attacks against AES, and cites djb's original work along with Tromer/Osvik's publication in 2005.<p>Last week at CCSW, we published a paper[1] detailing our attempts to bring these attacks to bear against Chromium.<p>In short, we don't see AES cache timing attacks as possible on more recent processors, and especially so once you factor in the sheer size of modern architected code.<p>[1] <a href="http://cseweb.ucsd.edu/~kmowery/papers/aes-cache-timing.pdf" rel="nofollow">http://cseweb.ucsd.edu/~kmowery/papers/aes-cache-timing.pdf</a>
I do not understand the jump from the NSA having a history of building systems from the chip up to the claim, by analogy, that the same is true for NIST (the "shared worldview" link is 20 years old). I'm not disagreeing with the statement; I just do not see any support for the conclusion that NIST's approach is bad for the general public because, unlike NIST's target customers, we are not building custom chips.<p>Can anyone shed any light?
I just checked and all of the computers and devices I own for work have AES hardware in them (Mac Mini, Macbook Air, iPhone). Maybe NIST thinks that, through standardization efforts, they can encourage more people to integrate such hardware over the long term?<p>The amount of hardware support that AES has already is pretty substantial: <a href="https://en.wikipedia.org/wiki/AES_instruction_set" rel="nofollow">https://en.wikipedia.org/wiki/AES_instruction_set</a><p>I'd rather not suppose there's something insidious going on here, just that maybe NIST is taking a longer-term view than racing to put AES and SHA3 in everything yesterday.
Käsper and Schwabe's bitsliced AES [1] does not need very long streams to be fast. It processes 8 blocks simultaneously, not 128 (as a 'pure' bitsliced approach would), and therefore reaches peak performance at relatively small lengths, starting at 128 bytes.<p>[1] <a href="http://cryptojedi.org/papers/aesbs-20090616.pdf" rel="nofollow">http://cryptojedi.org/papers/aesbs-20090616.pdf</a>
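To make the arithmetic behind that concrete, here is a rough sketch (not taken from the paper; the helper names `batch_bytes` and `utilization` are made up for illustration). It only assumes the 16-byte AES block size and the 8-vs-128 parallel block counts mentioned above, and shows how much of a batch a short message actually fills.

```python
AES_BLOCK_BYTES = 16  # AES block size is fixed at 128 bits


def batch_bytes(parallel_blocks: int) -> int:
    """Bytes consumed by one bitsliced batch of `parallel_blocks` AES blocks."""
    return parallel_blocks * AES_BLOCK_BYTES


def utilization(message_bytes: int, parallel_blocks: int) -> float:
    """Fraction of processed batch capacity holding real data.

    Assumes message_bytes > 0 (hypothetical helper, for illustration only).
    """
    size = batch_bytes(parallel_blocks)
    batches = -(-message_bytes // size)  # ceiling division
    return message_bytes / (batches * size)


if __name__ == "__main__":
    print(batch_bytes(8))          # 8 blocks per batch  -> 128 bytes
    print(batch_bytes(128))        # 128 blocks per batch -> 2048 bytes
    print(utilization(128, 8))     # 128-byte message, 8-way batch
    print(utilization(128, 128))   # 128-byte message, 128-way batch
```

With 8 blocks per batch, a 128-byte message fills the batch completely; with 128 blocks per batch, the same message would fill only 1/16 of it, which is why the 8-way layout gets up to speed on much shorter inputs.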
Hardware will evolve. CPUs' design constraints (programs with low parallelism and not much awareness of the memory hierarchy) have caused a bottleneck. SHA-3 will end up as yet another specialty instruction, with the actual programming done by the hardware vendor. For people who don't want to be dependent on that, I imagine GPUs provide a faster and more flexible alternative.
It seems strange that the author is complaining about AES speed. About a year ago, I benchmarked an IPsec setup between two cheap routers with ARM9 processors that did not have any special crypto blocks in them. AES significantly outperformed the other algorithms I tried.
I'm sure Intel have specific instructions for GCM that mitigate some of this. I know this doesn't translate to 'NIST are keeping software implementations in mind', but once these things are available on a few processors, that does make the software guy's job easier.