Too bad NVIDIA's (theoretical) FLOP numbers are always single precision (which is misleading if you're comparing against supercomputers, which are ranked by double precision performance), as their theoretical double precision FLOPs are always ~1/4 of their theoretical single precision numbers. The other problem is that the CUDA shader cores are relatively far from the ARM cores, which adds significant latency. That isn't really a problem for video rendering and other GPU tasks, but it makes things significantly worse for any processing with lots of random accesses (most compute-heavy workloads). I don't get why NVIDIA brags about compute performance, which always under-delivers relative to what they claim, when their chips are the best at what most end users actually care about... media/video processing.

(Disclaimer: I am the founder of a startup, http://rexcomputing.com, working on a new processor for high performance computing applications; it would compete with this chip in supercomputers, but not in any mobile/consumer tech.)
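For a rough sense of the gap, here's a minimal back-of-the-envelope sketch. The core counts and clocks are hypothetical (not from any specific NVIDIA datasheet), and it assumes 2 FLOPs per core per cycle from a fused multiply-add, plus the ~1/4 FP64 ratio mentioned above:

    # Theoretical peak = cores * clock * FLOPs-per-core-per-cycle.
    # All numbers here are illustrative placeholders, not real spec-sheet values.

    def peak_gflops(cuda_cores, clock_ghz, flops_per_cycle=2):
        """Theoretical peak in GFLOPS (2 FLOPs/cycle assumes fused multiply-add)."""
        return cuda_cores * clock_ghz * flops_per_cycle

    fp32 = peak_gflops(cuda_cores=256, clock_ghz=1.0)  # hypothetical mobile GPU
    fp64 = fp32 / 4  # ~1/4 of the FP32 rate, per the ratio above
    print(f"FP32 peak: {fp32:.0f} GFLOPS, FP64 peak: {fp64:.0f} GFLOPS")

So a headline "512 GFLOPS" figure quietly becomes ~128 GFLOPS for the double precision math that supercomputer benchmarks actually measure.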
Last decade's supercomputer(!) in today's pocket. Really, the sheer speed at which computing technology evolves is mind-blowing. I don't even want to guess where we'll be in ten years...