Professional compiler engineer here. C is a mediocre intermediate language.<p>Let's start with an excellent quote from Wittgenstein.
"The limits of my language mean the limits of my world."<p>Using C as your intermediate language means that your expressiveness is limited to valid C programs. This is workable, but only if your language can be mapped to C in _useful_ ways.<p>For example, let's say your language has behavior similar to Scheme's guaranteed tail calls.
How would you get this behavior from a C compiler? You will never be able to make this work reliably across optimization levels, etc.<p>Guaranteed tail calls are just the tip of the iceberg; there are many more features that cannot be reasonably mapped onto C.<p>Real compiler IRs increase your expressivity beyond what the C language designers decided was important.
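To make the tail-call point concrete, here is a minimal sketch of the usual workaround when targeting portable C: a trampoline. Every name in it (`struct bounce`, `step_fn`, `is_even`, `is_odd`, `trampoline`) is made up for illustration and is not taken from any particular compiler. Instead of calling the next function, each function returns a description of the call, and a driver loop performs it, so stack depth stays constant whatever the optimization level.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; a sketch, not any real compiler's output. */

struct bounce;
typedef struct bounce (*step_fn)(unsigned long n);

/* A bounce means either "call next(arg)" or "finished, here is the result". */
struct bounce {
    step_fn next;       /* NULL means the computation has finished */
    unsigned long arg;  /* argument to pass to next */
    bool result;        /* meaningful only when next == NULL */
};

static struct bounce is_odd(unsigned long n);

/* In the source language these two would tail-call each other directly.
   Here every tail call becomes a returned bounce instead. */
static struct bounce is_even(unsigned long n)
{
    if (n == 0)
        return (struct bounce){ NULL, 0, true };
    return (struct bounce){ is_odd, n - 1, false };
}

static struct bounce is_odd(unsigned long n)
{
    if (n == 0)
        return (struct bounce){ NULL, 0, false };
    return (struct bounce){ is_even, n - 1, false };
}

/* Driver loop: keeps calling whatever was returned until nothing is left. */
static bool trampoline(step_fn f, unsigned long arg)
{
    struct bounce b = { f, arg, false };
    while (b.next != NULL)
        b = b.next(b.arg);
    return b.result;
}
```

Calling `trampoline(is_even, n)` uses the same amount of stack for any `n`. The price is that every tail call in the source becomes a return plus an indirect call, and all functions that can tail-call each other have to share one signature, which is the sort of not-very-useful mapping the comment is complaining about.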
As far as I can tell, on all the axes on which LLVM excels as an intermediate language (ease of getting started, debugging support, many backends, optimization, flexibility), C is even better. C is easier to get started with, has easier debugging, more backends, better optimization, and more flexibility.<p>To take a simple example, I have here a 2428-line C program generated by compiling Linus Åkesson's Game of Life in BF (<a href="http://www.linusakesson.net/programming/brainfuck/" rel="nofollow">http://www.linusakesson.net/programming/brainfuck/</a>) into C using Daniel B. Cristofani's dbf2c.b (<a href="http://www.hevanet.com/cristofd/brainfuck/dbf2c.b" rel="nofollow">http://www.hevanet.com/cristofd/brainfuck/dbf2c.b</a>), which is a BF compiler written in BF. Compiling these 2428 lines of C to machine code using tcc 0.9.25 takes 20ms on my 1.6GHz Atom netbook. Most of this is about 16ms of tcc overhead (startup and shutdown time); the remaining 4ms or so works out to several hundred thousand lines of C per second. You should get several million lines of C per second with tcc on a modern machine.<p>This isn't optimized code; it's about equivalent to gcc -O0, typically about 3×–5× slower than optimized code. But that's enormously better than interpretation overhead.<p>(Using decent optimization levels with GCC makes it take several seconds to compile, because GCC's optimizer doesn't deal well with enormous functions.)<p>dbf2c.b, the C-generating BF compiler, is 892 bytes of BF code when stripped. Now, I'm not saying you should write your compilers in deliberately obfuscated programming languages in as few bytes as possible; I'm saying that the fact that this is even possible at all should give you really good feelings about how easy it is to compile things to C.
Indeed. C is brilliant as an intermediate language. One of the greatest things about it is that once you generate C code, it's not too much of a stretch to also generate C++- or Objective-C-specific code and take advantage of the libraries written in those languages.<p>Despite this, there are many people out there who view languages that compile to C as being inferior. I still don't understand it.
A few currently popular languages that generate C are:<p>* Nim<p>* Vala (GObject backend)<p>* PureScript (technically it has a C++ backend, but it is worth mentioning here for the very clean C++ it produces!)
It's also insanely easy to generate code for. In addition to this, you don't have to build an entire toolchain of LLVM stuff to get it to work on your system -- LLVM is a nightmare to set up on Windows.
Here is a list of compilers that can generate C code for different languages:<p><a href="https://github.com/dbohdan/compilers-targeting-c" rel="nofollow">https://github.com/dbohdan/compilers-targeting-c</a>
Portability and interoperability are key in industry.<p>I think all of the new languages are amazing these days. I am learning Rust, Clojure, Haskell, and many more.<p>But I cannot use any of them at work. I'm in a position to influence teams of smart programmers, but the only language that I would be able to use would be one that fits into the existing infrastructure, which is natively compiled shared libraries on esoteric Unix platforms.<p>So no LLVM. No JVM. No Haskell. I could possibly get away with a Lisp or Scheme that compiled to portable C if I wanted. But that doesn't excite me as much.<p>I would kill for a transpiler from Rust to human-readable C++. I think it could be used immediately by many.<p>I may have to write one.
This raises the question of which would be better: to compile your favorite language to C and use something like Emscripten to make JS, or to compile your code directly to JS?<p>As far as my experiments went, a Dhrystone test that scores 3000 natively scores 1300 when compiled to C, and that C scores 850 once compiled to JS (about 28% of the native score overall). Which suggests that a direct-to-JS compiler might be worth the effort...
I really wish this were more common, as language choices on embedded platforms are often VERY limited, i.e., C.<p>Though on embedded systems you'd also want more constraints for the backend, such as not using dynamic memory, and perhaps being able to specify that the output code be MISRA-compliant.
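As a rough illustration of the "no dynamic memory" constraint, this is the kind of C such a backend might emit in place of `malloc`: a fixed-size pool reserved at compile time. The names (`POOL_CAPACITY`, `struct node`, `node_alloc`) and the capacity of 16 are hypothetical, and nothing here is claimed to be MISRA-compliant; it only shows the allocation pattern.

```c
#include <stddef.h>

/* Hypothetical compile-time limit the backend would have to be told about. */
#define POOL_CAPACITY 16

struct node {
    int value;
    struct node *next;
};

/* All storage is reserved statically; nothing comes from the heap. */
static struct node pool[POOL_CAPACITY];
static size_t pool_used = 0;

/* Hands out the next unused slot, or NULL once the pool is exhausted.
   This sketch never returns slots to the pool. */
static struct node *node_alloc(int value)
{
    struct node *n;

    if (pool_used >= POOL_CAPACITY)
        return NULL;

    n = &pool[pool_used++];
    n->value = value;
    n->next = NULL;
    return n;
}
```

The trade-off is that the worst-case memory use is known at link time, but the source language has to expose (or the compiler has to infer) an upper bound on how many objects can exist.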