Common misconceptions about compilers

190 点作者 baziotis6 个月前

15 条评论

> As an example, in my desktop, it takes ~25min to compile a Debug version of LLVM, but ~23min to compile a Release version.oh I think I know what might cause this: TableGen. The `llvm-tblgen` run time accounts for a good chunk of LLVM build time. In a debug build `llvm-tblgen` is also unoptimized, hence the long run time generating those .inc / .def files. You can enable cmake variable `LLVM_OPTIMIZED_TABLEGEN` to build `llvm-tblgen` in release mode while leaving the rest of the project in debug build (or whatever `CMAKE_BUILD_TYPE` you choose).

评论 #42375001 未加载

评论 #42376355 未加载

评论 #42375587 未加载

评论 #42397597 未加载

mshockwave6 个月前

> The compiler optimizes for data locality> So, we have a single array in which every entry has the key and the value paired together. But, during lookups, we only care about the keys. The way the data is structured, we keep loading the values into the cache, wasting cache space and cycles. One way to improve that is to split this into two arrays: one for the keys and one for the values.Recently someone proposed this on LLVM: <a href="https://discourse.llvm.org/t/rfc-add-a-new-structure-layout-optimization-pass/80596" rel="nofollow">https://discourse.llvm.org/t/rfc-add-a-new-structure-layout-...</a>Also, I think what you meant by data locality here is really optimizing data layout, which, as you also mentioned, is a hard problem. But if it's just optimizing (cache's) locality, I think the classic loop interchange also qualifies. Though it's not enabled by default in LLVM, despite being there for quite a while.

评论 #42374707 未加载

Animats6 个月前

About C/C++ compilers, mostly.C++ has terrible template and include compilation problems, for historical reasons.

评论 #42374466 未加载

rkagerer6 个月前

How about reproducibility?I always took for granted most compilers will generally output consistent binary code given identical inputs. Then the other day I saw a post here about optimization timeouts and such.

评论 #42377375 未加载

评论 #42382268 未加载

eru6 个月前

> Knowing which paths are hot in any Javascript bytecode is not enough to produce optimized code; you also need to know the types. And you only know the types at runtime. This is the main reason why JIT compilers compile code at runtime.You can often make good guesses statically. Especially if your JavaScript was produced from something like a TypeScript source.

评论 #42374432 未加载

评论 #42374643 未加载

评论 #42378424 未加载

评论 #42376546 未加载

timeforcomputer6 个月前

Why does optimal substructure work for code size? Couldn't you take a compiled non-minimal function and break each instruction into a function call (with some assumptions about the ISA maybe), then assuming each single-instruction function is minimal by itself, infer the entirety is minimal, contradicting?

评论 #42375191 未加载

评论 #42375061 未加载

评论 #42396858 未加载

ykonstant6 个月前

Nice write-up and overall very informative blog. Kind of off topic, but are you Stefanos, doing your PhD at UIUC? That's my alma mater, I hope you find the academic* environment there stimulating!*Let's not talk about the social environment of UC.

评论 #42375507 未加载

dspillett6 个月前

> -O3 produces much faster code than -O2Also “-Ox gives similar results all the time”. Any optimisation is a gamble, some of them quite unsure gambles. Different cache sizes and management strategies between processing units mean that what might be a small benefit for one could be a detriment when the code is running on another, and even if you specifically tune the optimisations for a given set of chips you are at the mercy of the incoming data which could invalidate any educated guesses made about branch likelihood and so forth.

kazinator6 个月前

If you have both an interpreter and compiler, then boostrapping the (self hosted) language is a lot simpler. The interpreter is written in some widely available language like C. The interpreter runs the compiler and standard library code of your language in order to compile that compiler and standard library. After that, those parts execute as native machine code or byte code (with or without JIT).Without the interpreter, you have to ship the compiled versions of those parts, even to someone building your language from scratch. Or perhaps ship an entire prebuilt package of the language.An interpreter creates a third opinion on what code should do, resulting in a documentation-interpreter-compiler triangle, where in a situation in which it is unclear what the right behavior is, weight can be given to the situation that two out of those three pieces agree.Optimized versus unoptimized isn't exactly the same, because there is a dependency. If unoptimized translation has a certain behavior, and optimization doesn't wreck it (prevalent case), then of course the two are going to agree. Their agreement doesn't carry weight against documentation. An interpreter and compiler are almost totally independent, except for where they get the code to share run-time support and library functions.

mshockwave6 个月前

Re: Why does link-time optimization (LTO) happen at link-time?I think maybe LLVM's ThinLTO is what you're looking for where whole-program optimization (more or less) happens in middle end.

评论 #42374765 未加载

norir6 个月前

Good observations but a more accurate title would be common misconceptions about c/c++ compilers. I point this out because I think that there is a lot of room for compiler innovation that will come from developing new compilers for new languages and I don't want us to continue being anchored in the mistakes of the past.

评论 #42378414 未加载

ben-schaaf6 个月前

> For example, minijava-cpp is a MiniJava compiler written in C++ which has some pretty heavy code in it. It compiles in ~0.7s in my laptop, while linking all these separate files takes much longer.I can't reproduce this result at all. Using compile.sh it takes my 7950x 0.9s to compile the whole project. A quick meson/ninja setup took 0.8s from a clean build directory, and just 0.043s to link using gnu-ld. Using faster linkers yields even better results: 0.025s for gold and 0.013s for mold. I'm curious how you got your results?> SQLite, too, compiles everything into a single .c program.It takes my computer 25 seconds to compile sqlite when making incremental changes. My work's similarly sized C++(!) repo links 3 executables in 0.07s using mold.I'm having a really hard time seeing how unity builds could possibly be faster. Maybe with an exceedingly terrible linker or spinning rust?

baziotis6 个月前

Hey folks, author here. I'm really happy the article has gotten such attention. I hope it's good attention and that you folks learned something! Unfortunately, though, it's _very_ hard to keep track of all the comments in HackerNews. So, if you have a comment you'd like me to reply to and I haven't, please either use the Reddit post: <a href="https://www.reddit.com/r/Compilers/comments/1hapetl/common_misconceptions_about_compilers/" rel="nofollow">https://www.reddit.com/r/Compilers/comments/1hapetl/common_m...</a>or head over to my website: <a href="https://sbaziotis.com/" rel="nofollow">https://sbaziotis.com/</a> for more contact info.Best, Stefanos

eru6 个月前

If the author is here: the link from the table of contents to '-O0 gives you fast compilation' is broken.

评论 #42374275 未加载

high_na_euv6 个月前

>Separate compilation is always worth it>Separate compilation is the idea that you compile a different object file for each source file.22 The reasoning of why this is beneficial is that if you change one source file, you only have to re-compile that one file and "just" link it with the other object files; you don't have to re-compile the whole project.That reflects my feelings that linker is bullshit step

评论 #42376650 未加载

15 条评论

mshockwave6 个月前

评论 #42375001 未加载

评论 #42376355 未加载

评论 #42375587 未加载

评论 #42397597 未加载

mshockwave6 个月前

评论 #42374707 未加载

Animats6 个月前

About C/C++ compilers, mostly.C++ has terrible template and include compilation problems, for historical reasons.

评论 #42374466 未加载

rkagerer6 个月前

评论 #42377375 未加载

评论 #42382268 未加载

eru6 个月前

评论 #42374432 未加载

评论 #42374643 未加载

评论 #42378424 未加载

评论 #42376546 未加载

timeforcomputer6 个月前

评论 #42375191 未加载

评论 #42375061 未加载

评论 #42396858 未加载

ykonstant6 个月前

评论 #42375507 未加载

dspillett6 个月前

kazinator6 个月前

mshockwave6 个月前

Re: Why does link-time optimization (LTO) happen at link-time?I think maybe LLVM's ThinLTO is what you're looking for where whole-program optimization (more or less) happens in middle end.

评论 #42374765 未加载

norir6 个月前

评论 #42378414 未加载

ben-schaaf6 个月前

baziotis6 个月前

eru6 个月前

If the author is here: the link from the table of contents to '-O0 gives you fast compilation' is broken.

评论 #42374275 未加载

high_na_euv6 个月前

评论 #42376650 未加载