The paper covers inefficiencies in compiler-generated binaries, arising either from missed optimization opportunities or from the compiler's own poor code generation.<p>The paper shows that some of these perceived inefficiencies can be resolved by manual changes to a few lines of code, such as eliminating redundancies. However, it then demonstrates that such manual changes can sometimes make the results <i>worse</i>, because a particular compiler may no longer be able to perform inlining or vectorization after the change.<p>The paper concludes the optimization study by stating, "no compiler is better or worse than another production compiler in all cases." (The study covered GCC, LLVM, and ICC.)
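<p>A minimal, hypothetical sketch in C of how that can happen (names invented, not taken from the paper): moving a computation into a helper function in another translation unit looks like a harmless cleanup, but without LTO the call can't be inlined, and an opaque call inside a loop usually stops auto-vectorization as well.<p><pre><code>/* before: everything is visible to the compiler; at -O2/-O3 this loop
 * can be inlined into callers and auto-vectorized. */
void saxpy_before(float *y, const float *x, float a, int n)
{
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* after a manual refactoring: the multiply now lives in a helper defined
 * in another translation unit, e.g. helper.c containing
 *     float scale(float a, float x) { return a * x; }
 * Without LTO the call cannot be inlined, and a call in the loop body
 * usually defeats auto-vectorization entirely. */
float scale(float a, float x);   /* defined elsewhere */

void saxpy_after(float *y, const float *x, float a, int n)
{
    for (int i = 0; i < n; i++)
        y[i] = scale(a, x[i]) + y[i];
}</code></pre><p>Whether the "after" version is actually slower depends on the compiler, flags, and whether LTO is enabled -- which is rather the paper's point.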
A lesson I learned as an undergraduate: sometimes changing optimization settings can yield different answers.<p>It's been ~17 years since then, so I can't recall the exact options, but seeing different results appear when I used a higher optimization level caused an unanticipated puff of smoke from my little brain.<p>A possible moral of the story -- test every little change.
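<p>A small, hypothetical C reproduction of that kind of surprise (not the original code, obviously): once the compiler is allowed to reassociate floating-point arithmetic -- e.g. GCC/Clang with -ffast-math or -Ofast, or a compiler whose default FP model is already relaxed -- summing the same numbers can print a visibly different answer.<p><pre><code>#include &lt;stdio.h&gt;

int main(void)
{
    /* Adding many small values to a large running total loses them one at
     * a time in strict left-to-right order; a reassociated or vectorized
     * sum groups the small values first and keeps them. */
    float big = 1.0e8f;      /* spacing between adjacent floats here is 8.0 */
    float sum = big;

    for (int i = 0; i < 10000; i++)
        sum += 1.0f;         /* each 1.0f rounds away under strict FP rules */

    /* prints ~0 with strict FP, roughly 10000 with -Ofast / -ffast-math */
    printf("%f\n", sum - big);
    return 0;
}</code></pre>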
I've skimmed the paper, and I'm not sure what to make of it. You definitely need tools, but even basic profiling is a good start, and it's quite rare in practice. Suggesting source changes based on the behaviour of a particular compiler version's optimizer is generally a bad idea, especially without checking the optimization reports and investigating the available tuning parameters, like cost models. Optimizers can be quite unstable, for good reasons.<p>The paper doesn't concentrate on what are probably the two most important performance governors these days, vectorization and fitting the memory hierarchy, each of which can make a factor-of-several difference. ("Your mileage may vary", of course.) It also isn't obvious whether comparing different generations of compilers was sufficiently controlled; their default tuning may target different micro-architectures, whichever were considered most relevant at the time. GCC performance regressions are monitored by maintainers, but I wonder how many real optimization bugs in a maintained version have been uncovered and reported as such from this sort of work.<p>In large-scale scientific code, such compiler optimizations may be relatively unimportant anyway compared with the time spent in libraries of various sorts: numerical (like BLAS, FFT), MPI communication, and filesystem I/O. Or not: you must measure and understand the measurements, which is what users need to be told over and over.
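<p>Concretely, before touching any source it's worth asking the compiler what its vectorizer actually did and why; both GCC and Clang will report this. A small sketch (the loop and file names are invented; the flags are real):<p><pre><code>/* Ask for the vectorization report instead of guessing:
 *
 *   gcc   -O3 -march=native -fopt-info-vec -fopt-info-vec-missed -c stencil.c
 *   clang -O3 -Rpass=loop-vectorize -Rpass-missed=loop-vectorize  -c stencil.c
 *
 * The report says which loops were vectorized and why others were not
 * (possible aliasing, unsupported reduction, cost model declined, ...). */
#include &lt;stddef.h&gt;

void stencil(double *restrict out, const double *restrict in, size_t n)
{
    /* restrict removes the aliasing question; the report then tells you
     * whether the cost model still declined to vectorize this loop. */
    for (size_t i = 1; i + 1 < n; i++)
        out[i] = 0.25 * in[i - 1] + 0.5 * in[i] + 0.25 * in[i + 1];
}</code></pre>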
How does this pan out for the major scientific libraries, like NAG, IMSL/MKL, GSL, BLAS, LAPACK, etc.? Do distributions control the optimization levels used with a particular compiler, or just make sure the build passes the tests?
I'm seeing these kinds of titles ("what every X programmer should know about Y") quite frequently; it seems like the trend of adding "for humans" in front of library names.