> The improvements I did are mostly what could be described as “bottom-up micro-optimizations”.<p>> I also did two larger “architectural” or “top-down” changes<p>My summer intern started doing profiling work on compile times with clang: <a href="https://lists.llvm.org/pipermail/llvm-dev/2020-July/143012.html" rel="nofollow">https://lists.llvm.org/pipermail/llvm-dev/2020-July/143012.h...</a><p>Some things we found:<p>* For a large C codebase like the Linux kernel, we're spending way more time in the front end (clang) than the back end (llvm). This was surprising given rustc's experience with llvm. Experimental patches simplifying header inclusion dependencies in the kernel's sources can potentially cut down on build times by ~30% with EITHER gcc or clang.<p>* There's a fair amount of low-hanging fruit that stands out from bottom-up profiling. We've just started fixing these, but the most immediate was that 13% of a Linux kernel build was spent recomputing target information for every inline assembly statement, in a way that was accidentally quadratic and not memoized when it could have been (in fact, my intern wrote patches to compute these at compile time, even). Fixed in clang-11. That was just the first one found and fixed, but we have a good list of what to look at next. The only real samples showing up in the llvm namespace (vs clang) are from llvm's StringMap bucket lookup, but those come from clang's preprocessor.<p>* GCC beats the crap out of Clang in compile times of the Linux kernel; we need to start looking for top-down optimizations to do less work overall. I suspect we may be able to get some wins out of lazy parsing at the cost of missing diagnostics (warnings and errors) in dead code.<p>* Don't speculate on what could be slow; profiles <i>will</i> surprise you.<p>> Using instruction counts to compare the performance of two entirely different programs (e.g. 
GCC vs clang) would be foolish, but it’s reasonable to use them to compare the performance of two almost-identical programs<p>Agree. We prefer cycle counts via LBR, but only for comparing diffs of the same program, as you describe.
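<p>To make the inline assembly item above concrete, here is a minimal sketch of the memoization idea: compute per-target data once and reuse it for every inline asm statement, rather than recomputing it each time. This is not the actual clang patch. <code>TargetFeatures</code>, <code>computeTargetFeatures</code>, <code>getTargetFeatures</code>, <code>processInlineAsm</code>, and the call counter are all hypothetical names made up for illustration, and the "expensive" work is a stand-in.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for per-target data; NOT clang's real TargetInfo.
struct TargetFeatures {
    std::vector<std::string> clobberNames;
};

// Counts how often the expensive path runs, so the effect is observable.
static std::size_t g_computeCalls = 0;

// Pretend-expensive computation that depends only on the target triple.
TargetFeatures computeTargetFeatures(const std::string &triple) {
    assert(!triple.empty());
    ++g_computeCalls;
    TargetFeatures f;
    for (int i = 0; i < 16; ++i)
        f.clobberNames.push_back(triple + "-r" + std::to_string(i));
    return f;
}

// Memoized wrapper: compute once per triple, then reuse the cached result.
// References to unordered_map elements stay valid across later insertions.
const TargetFeatures &getTargetFeatures(const std::string &triple) {
    static std::unordered_map<std::string, TargetFeatures> cache;
    auto it = cache.find(triple);
    if (it == cache.end())
        it = cache.emplace(triple, computeTargetFeatures(triple)).first;
    return it->second;
}

// Simulates handling `statements` inline asm statements for one target.
std::size_t processInlineAsm(const std::string &triple, std::size_t statements) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < statements; ++i)
        total += getTargetFeatures(triple).clobberNames.size();
    return total;
}
```

<p>With the cache in place, a translation unit with 1000 inline asm statements for one target runs the expensive computation once instead of 1000 times. Calling <code>computeTargetFeatures</code> directly inside the loop would model the unmemoized behavior.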