The post goes directly from *"heap allocations were frequent within rustc"* to *"effort to minimize heap allocations"*, and then proceeds to detail the speedups.

→ Systems programming newbie question: why are heap allocations bad for performance? Is it the additional level of indirection? The cost of calling your memory allocator? Something else?

My background, if that helps focus answers: Python/JS programmer, did a tiny bit of C/C++, am ~approximately~ familiar with the stack (call frames, each with its own context) vs. the heap (where memory is allocated for big/long-lived objects, e.g. arrays and trees).
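For illustration, here is a minimal Rust sketch of the contrast being asked about; the `Point` type and the loop are invented for this example, and the comments summarize the usual explanation (allocator bookkeeping, pointer indirection, cache locality):

    // A small value type; the name is invented for this example.
    #[derive(Debug)]
    struct Point {
        x: f64,
        y: f64,
    }

    fn main() {
        // Stack allocation: space in the current call frame, reserved by
        // bumping the stack pointer -- essentially free at runtime.
        let on_stack = Point { x: 1.0, y: 2.0 };

        // Heap allocation: Box::new calls into the global allocator, which
        // must find a suitable free block and record bookkeeping for it
        // (and free it again when the Box is dropped). Every later access
        // also goes through a pointer, which can miss the CPU cache.
        let on_heap: Box<Point> = Box::new(Point { x: 3.0, y: 4.0 });

        // In a hot loop the allocator cost repeats on every iteration,
        // which is why allocation-heavy code shows up in profiles.
        let mut boxed: Vec<Box<u64>> = Vec::new();
        for i in 0..1_000 {
            boxed.push(Box::new(i)); // one allocator call per iteration
        }

        println!("{:?} {:?} {}", on_stack, on_heap, boxed.len());
    }

None of these costs is huge individually; the problem is frequency, which is presumably why the post's approach is to allocate less rather than to allocate faster.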
Here's some thinking outside the box. Traditional compilers focus on building an executable as fast as possible, throwing away an enormous amount of state each time. The new rustc incremental compilation attempts to reuse some of that computed state, although it's still early days. If, however, the compiler's state remains persistent (i.e. it runs as a daemon), then small code changes should usually be pretty fast; it's analogous to the code scanners in Eclipse. If the target of the compiler isn't a native executable but a continuously updated image containing the code, then linking gets pretty fast as well. The result will not be fast, but it will be _fast enough_ to test changes.
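A hypothetical sketch of the caching idea behind such a daemon, assuming the coarsest possible granularity (whole modules keyed by a content hash); every name here (`Daemon`, `CompiledModule`, `analyze`) is invented, and a real incremental compiler tracks far finer-grained dependencies:

    use std::collections::hash_map::DefaultHasher;
    use std::collections::HashMap;
    use std::hash::{Hash, Hasher};

    // Stand-in for the expensive per-module state a compiler computes.
    struct CompiledModule {
        symbols: Vec<String>,
    }

    // Pretend "analysis": the costly step the daemon wants to avoid redoing.
    fn analyze(source: &str) -> CompiledModule {
        println!("(re)analyzing...");
        CompiledModule {
            symbols: source.split_whitespace().map(String::from).collect(),
        }
    }

    fn hash_of(source: &str) -> u64 {
        let mut h = DefaultHasher::new();
        source.hash(&mut h);
        h.finish()
    }

    // Long-lived state the daemon keeps across "builds".
    struct Daemon {
        // module name -> (hash of its source, computed state)
        cache: HashMap<String, (u64, CompiledModule)>,
    }

    impl Daemon {
        // Recompute a module only if its source hash changed; otherwise
        // reuse the state computed on an earlier request.
        fn compile(&mut self, name: &str, source: &str) -> usize {
            let h = hash_of(source);
            let fresh = matches!(self.cache.get(name), Some((old, _)) if *old == h);
            if !fresh {
                self.cache.insert(name.to_string(), (h, analyze(source)));
            }
            self.cache[name].1.symbols.len()
        }
    }

    fn main() {
        let mut daemon = Daemon { cache: HashMap::new() };
        daemon.compile("lib", "fn alpha fn beta");  // cold: analyzed
        daemon.compile("lib", "fn alpha fn beta");  // warm: cache hit
        daemon.compile("lib", "fn alpha fn gamma"); // changed: re-analyzed
    }

The last two lines of main are the whole point: once the state persists across builds, only the changed module costs anything.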
Can Rust capitalize on LTO/PGO? Even if that's not quite ready for prime time, if we're spending 50% or more of the time in the LLVM backend, that backend can certainly be built with LTO/PGO.

Seems like it might be worth the trouble/bootstrapping challenge if it yields another ~5%.
Nice article, although I would have liked to see a before/after summary bar chart of the benchmarks for all PRs combined; I'm curious how all these incremental improvements add up.