I'm self-taught on compilers, so I focus more on "how" to do stuff.<p>1- Focus on AST transformations. There is a lot written about parsing, but that is the "easiest" part (use a parser generator, Pratt parsing or combinators).<p>The AST is where the "action" is. I have even made toy langs without any parsing at all (I built a small internal DSL instead).<p>2- Don't expect much information about the <i>real neat</i> stuff.<p>How do you make a REPL for a compiler? How do you enable debugging? How do you represent algebraic data types? How do you test them? How do you do FFI? Which data structures should the rest be built on? How do you profile them? How do you do type inference? Which GC should you use? How do you implement a GC? How do you implement macros and generics (i.e. without Lisp)? How do you implement generators? Etc.<p>You will find A LOT of this in papers. But real examples? Never.<p>So I think if you wanna get serious, learn how to read papers. I don't get the weird math they use, and my ignorant impression is that VERY few have real information even once understood. Having the abstract math is small potatoes when it comes time to implement.<p>So many times I get answers like "it's easy, dude", and when I press for how: "just read how LLVM is made!".<p>3- That is why I'm very glad of<p><a href="http://journal.stuffwithstuff.com/category/language/" rel="nofollow">http://journal.stuffwithstuff.com/category/language/</a>
<a href="http://craftinginterpreters.com" rel="nofollow">http://craftinginterpreters.com</a><p><i>Real gems</i> here.<p>4- You need to read Lisp, OCaml & Haskell if you wanna get some good ideas. I'm using Rust, and the little that is there (i.e. toy langs) is good!<p>5- I don't know what to do with LLVM and other larger codebases. Too many complications, and when they are written in Java, .NET (except F#), C or, worse, C++, the noise is big. The samples in OCaml, and sometimes Haskell and Lisp, are much clearer.<p>Or in other words, small/medium compilers are better for getting stuff.<p>6- Semantics & features. This is the meat. The toy math calculator is too easy. The moment you wanna do OO, laziness, algebraic data types, streaming, a structural type system, etc. is when you will see how sparse the actual info is. So narrow down the kind of semantics/features you look for.<p>Just adding this or that could lead to MASSIVE changes in how you build the language.<p>For example, I'm doing a relational language (<a href="http://tablam.org" rel="nofollow">http://tablam.org</a>).<p>It is not that conventional, and a lot of the info comes from the RDBMS guys, which means a lot of detours into STORAGE/ACID and not actual languages!<p>7- Finally, pick your host language with care. For compilers that transpile it probably doesn't matter much, but your host will define the boundaries of how and what you can do "easily".
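<p>To make point 1 a bit more concrete, here is a minimal sketch in Rust of what "no parser, just an internal DSL plus AST transformations" could look like. Everything in it is an assumption made up for illustration (the <i>Expr</i> type, the <i>num</i>/<i>var</i>/<i>add</i>/<i>mul</i> helpers and the <i>fold</i> pass); it is not taken from any real compiler, and a real language would need many more node kinds and passes.

```rust
// Hypothetical toy AST for a tiny arithmetic language (illustrative names only).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// The "internal DSL": plain constructor helpers stand in for a parser,
// so programs are written directly as host-language expressions.
fn num(n: i64) -> Expr {
    Expr::Num(n)
}

fn var(name: &str) -> Expr {
    Expr::Var(name.to_string())
}

fn add(a: Expr, b: Expr) -> Expr {
    Expr::Add(Box::new(a), Box::new(b))
}

fn mul(a: Expr, b: Expr) -> Expr {
    Expr::Mul(Box::new(a), Box::new(b))
}

// One AST-to-AST transformation: constant folding, applied bottom-up.
// Children are folded first, then the current node is simplified if possible.
// Note: plain `+` and `*` are used, so overflow is not handled in this sketch.
fn fold(e: Expr) -> Expr {
    match e {
        Expr::Add(a, b) => match (fold(*a), fold(*b)) {
            // Both sides are constants: compute the result now.
            (Expr::Num(x), Expr::Num(y)) => Expr::Num(x + y),
            // Adding zero changes nothing: keep the other side.
            (Expr::Num(0), other) | (other, Expr::Num(0)) => other,
            (x, y) => add(x, y),
        },
        Expr::Mul(a, b) => match (fold(*a), fold(*b)) {
            (Expr::Num(x), Expr::Num(y)) => Expr::Num(x * y),
            // Multiplying by one changes nothing: keep the other side.
            (Expr::Num(1), other) | (other, Expr::Num(1)) => other,
            (x, y) => mul(x, y),
        },
        // Leaves (numbers and variables) are already as simple as they get.
        leaf => leaf,
    }
}
```

With these helpers, <i>fold(add(var("x"), mul(num(2), num(3))))</i> gives back the tree for <i>x + 6</i>. The point of the sketch is only that you can start experimenting with transformations right away and bolt on a parser later, once the AST shape has settled.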