I've skimmed the paper, and I'm not sure what to make of it. You definitely need tools, but even basic profiling is a good start, and it's still quite rare. Suggesting source changes because of the behaviour of one particular compiler version's optimizer is generally a bad idea, especially without checking the optimization reports and investigating the available tuning parameters, like cost models. Optimizers can be quite unstable, for good reasons.

The paper doesn't concentrate on what are probably the two most important performance governors these days, vectorization and fitting the memory hierarchy, which can each make a factor of several difference ("your mileage may vary", of course); there's a toy sketch of that point at the end of this comment. It also isn't obvious that the comparison across compiler generations was sufficiently controlled: their default tuning may target different micro-architectures, whichever were considered most relevant at the time. GCC performance regressions are monitored by the maintainers, but I wonder how many real optimization bugs in a maintained version have actually been reported as such from this sort of work.

In large-scale scientific code, such compiler optimizations may be relatively unimportant anyway compared with the time spent in libraries of various sorts: numerical (BLAS, FFT, and the like), MPI communication, and filesystem I/O. Or not: you must measure and understand the measurements, which is what users need to be told over and over.
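
To make the vectorization / memory-hierarchy point concrete, here is a minimal toy sketch (mine, not from the paper). Both functions scale the same matrix; the second walks it in the order it is laid out in memory, so it uses the caches well and vectorizes with unit stride, while the first usually doesn't. Compiling with something like gcc -O3 -march=native -fopt-info-vec-missed shows what the vectorizer did, or declined to do, for each loop, and -fvect-cost-model= lets you experiment with the cost model mentioned above.

    #include <stdio.h>
    #include <stddef.h>

    #define N 2048
    static double a[N][N];

    /* Column-wise walk over a row-major array: large strides and poor
       cache reuse; the vectorizer's report typically blames the stride. */
    void scale_strided(double s) {
        for (size_t j = 0; j < N; j++)
            for (size_t i = 0; i < N; i++)
                a[i][j] *= s;
    }

    /* Row-wise walk: contiguous, unit-stride accesses that both cache
       and vectorize cleanly. */
    void scale_contiguous(double s) {
        for (size_t i = 0; i < N; i++)
            for (size_t j = 0; j < N; j++)
                a[i][j] *= s;
    }

    int main(void) {
        for (size_t i = 0; i < N; i++)
            for (size_t j = 0; j < N; j++)
                a[i][j] = 1.0;
        scale_strided(2.0);
        scale_contiguous(0.5);
        printf("%f\n", a[N - 1][N - 1]); /* keep the work observable */
        return 0;
    }

Timing the two loops and reading the -fopt-info output side by side is exactly the kind of basic measurement I mean by "profiling is a good start".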
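
And for the "measure it" point in the last paragraph, a rough sketch (again mine, and it assumes some CBLAS implementation such as OpenBLAS is installed and linked, e.g. with -lopenblas): time a naive triple loop against the same multiply done by the library. On typical HPC nodes the library wins by a large factor, which is why compiler flags alone may not be the story; whether that holds for your code is precisely what the measurement is for.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <cblas.h>

    #define N 512

    static double wall(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec * 1e-9;
    }

    int main(void) {
        double *A = malloc(sizeof *A * N * N);
        double *B = malloc(sizeof *B * N * N);
        double *C = malloc(sizeof *C * N * N);
        for (int i = 0; i < N * N; i++) { A[i] = 1.0; B[i] = 2.0; C[i] = 0.0; }

        /* Naive triple loop, left entirely to the compiler's optimizer. */
        double t0 = wall();
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                for (int k = 0; k < N; k++)
                    C[i * N + j] += A[i * N + k] * B[k * N + j];
        double t1 = wall();

        /* The same operation through the tuned library. */
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    N, N, N, 1.0, A, N, B, N, 0.0, C, N);
        double t2 = wall();

        printf("naive: %.3fs  dgemm: %.3fs\n", t1 - t0, t2 - t1);
        free(A); free(B); free(C);
        return 0;
    }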