GPUs and CPUs are in a constant arms race, so whatever is faster on one today might be slower tomorrow.<p>GPUs are very good at doing lots of floating point math in parallel. Historically, CPUs have been better at branching, multiple instruction issue, out-of-order execution, and integer math, and they really pile on the cache. CPUs have SIMD too, so they are no slouch at bulk floating point calculations either.<p>Memory (I/O) is now one of the largest bottlenecks for both GPUs and CPUs, because memory buses are much slower than the processors they feed, so this will often be your dominant factor. Most of your data starts out in main RAM, and the CPU lives closer to it and tends to have aggressive cache hierarchies (L1, L2, L3) to help. That gives the CPU an advantage whenever the data is not already local to the GPU.<p>I don't know if it still holds true, but NxM matrix math used to be faster on CPUs for very large values of N and M because of cache locality: the CPU had an easier time keeping the matrix values that needed to be reused in cache. GPUs, on the other hand, tend to be really good at 4x4 matrices, since that is what graphics primarily uses.
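To make the cache-locality point about large matrices a bit more concrete, here is a rough sketch of loop tiling (blocking), which is one common way to keep reused matrix values in cache. The function names `matmul_naive` and `matmul_tiled` and the default tile size of 32 are made up for illustration, not taken from any library. Python is used only to show the access pattern; interpreter overhead hides the cache effect here, so any real speedup would only show up in a compiled language.

```python
def matmul_naive(a, b):
    """Reference triple loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, k_dim, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for k in range(k_dim):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, tile=32):
    """Same product, computed one tile x tile block at a time.

    The three outer loops pick a block of each matrix; the three inner
    loops do all the work for that block before moving on, so the same
    small set of values is touched repeatedly while it is (hopefully)
    still in cache. The tile size is an assumption and would need tuning
    to the actual cache sizes.
    """
    n, k_dim, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for k0 in range(0, k_dim, tile):
            for j0 in range(0, m, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for k in range(k0, min(k0 + tile, k_dim)):
                        a_ik = a[i][k]  # reused across the whole j loop
                        for j in range(j0, min(j0 + tile, m)):
                            c[i][j] += a_ik * b[k][j]
    return c
```

Both functions should give the same result for any tile size of 1 or more. The tiled version only changes the order in which the elements are visited, which is the whole trick: the arithmetic is identical, but the memory traffic is not.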