V1: (PDP-7 Unix) Kernel is written in assembly. C does not exist yet.<p>V2: (PDP-11 Unix) Kernel is written in assembly. C compiler is written in assembly.<p>V3: Kernel is written in assembly. C compiler is written in C.<p>V4: Kernel is written in C. C compiler is written in C.<p><a href="https://www.tuhs.org/cgi-bin/utree.pl?file=PDP7-Unix" rel="nofollow">https://www.tuhs.org/cgi-bin/utree.pl?file=PDP7-Unix</a><p><a href="https://www.tuhs.org/cgi-bin/utree.pl?file=V2" rel="nofollow">https://www.tuhs.org/cgi-bin/utree.pl?file=V2</a><p><a href="https://www.tuhs.org/cgi-bin/utree.pl?file=V3" rel="nofollow">https://www.tuhs.org/cgi-bin/utree.pl?file=V3</a><p><a href="https://www.tuhs.org/cgi-bin/utree.pl?file=V4" rel="nofollow">https://www.tuhs.org/cgi-bin/utree.pl?file=V4</a>
We had a few pretty cool university courses touching on this. It depends a bit on the work you want to do, but:<p>- Pick a language that's simple enough. A subset of ML would be good, but if you want to actually finish, I'd recommend a simple LISP. This is your new language; in the Unix story, this is C.<p>- Use a language you know and like to implement a compiler for the new language. This is your bootstrap language. Have the compiler emit, for example, C--, ASM or LLVM, depending on what you know. This is the target language. As a recommendation, keep this compiler as simple as possible so you have a reference for the next step. For C, both the bootstrap and the target language were ASM.<p>- Now iterate on extending the stdlib of your new language until you can implement a compiler for the new language in the new language itself. Again, keep this compiler simple, without optimizations or extra passes; just generate the most trivial machine code possible. This usually takes a bit of back-and-forth. You'll need function evaluation and expression evaluation first (this is where a Lisp can be an advantage, as those are the same thing), then function definitions, then filesystem interactions and so on. You kinda discover what you need as you implement.<p>- Once you have all of that, (i) compile the new-language compiler with the compiler written in your bootstrap language, and (ii) compile the new-language compiler again using the result of (i). If you want to verify the results, (iii) compile the compiler once more with the output of (ii) and check that (ii) and (iii) are identical.<p>- Your new language is now self-hosted.<p>This was fun, because it was accompanied by other courses on how processor microcode implements processor instructions, how different kinds of assembly are mapped onto processor instructions, and how higher-level languages are compiled down into assembly. 
All of this across 4-6 semesters resulted in a pretty good understanding of how Java ends up being flip-flop operations.<p>EDIT - got target & bootstrap mixed up in first part.
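To make the (i)/(ii)/(iii) verification step above concrete, here is a minimal sketch in Python. It is a toy model, not a real compiler: the `Binary` and `Source` types, the `compile_with` function, and all the label strings are made up for illustration. The one assumption it encodes is that what a compiled program does comes from its source, while how it is encoded comes from whichever compiler built it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    """A compiled compiler, modelled by two labels (both hypothetical)."""
    behavior: str  # which code generator this binary implements
    encoding: str  # which code generator produced this binary's own machine code


@dataclass(frozen=True)
class Source:
    """Source of the new-language compiler, written in the new language."""
    codegen: str  # name of the code generator this source describes


def compile_with(compiler: Binary, source: Source) -> Binary:
    # What the output does comes from the source;
    # how the output is encoded comes from the compiler that built it.
    return Binary(behavior=source.codegen, encoding=compiler.behavior)


# The compiler written in the bootstrap language.
bootstrap = Binary(behavior="bootstrap-codegen", encoding="hand-written")
# The compiler for the new language, written in the new language.
new_compiler_src = Source(codegen="new-codegen")

stage1 = compile_with(bootstrap, new_compiler_src)  # step (i)
stage2 = compile_with(stage1, new_compiler_src)     # step (ii)
stage3 = compile_with(stage2, new_compiler_src)     # step (iii)

print(stage1 == stage2)  # False: same behavior, but built by different code generators
print(stage2 == stage3)  # True: the fixed point you are checking for
```

In this model (i) and (ii) are allowed to differ, because the bootstrap compiler and the new compiler generate code differently. (ii) and (iii) were both built by a compiler with the same behavior from the same source, so a mismatch there would point to a bug (or non-determinism) in the new compiler.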
Computerphile on YouTube has a pretty good series on bootstrapping a compiler:<p><a href="https://www.youtube.com/watch?v=lJf2i87jgFA&list=RDLVlJf2i87jgFA&start_radio=1&rv=lJf2i87jgFA&t=2">https://www.youtube.com/watch?v=lJf2i87jgFA&list=RDLVlJf2i87...</a>
See Dennis Ritchie's <i>The Development of the C Language</i>: <a href="https://www.bell-labs.com/usr/dmr/www/chist.html" rel="nofollow">https://www.bell-labs.com/usr/dmr/www/chist.html</a><p>Pre-C Unix was written in assembly.
C and Unix got started at approximately the same time, and the path to C was stepwise. The first Unix (IIRC, someone can correct me) was written in assembly. A simple compiled language was written around the same time, and that language was used to write a more complicated compiler. There was another such cycle (I think) before there was a C compiler, and even then it wasn't the full C, but a subset. Unix was then rewritten in C.
Bootstrapping is a fun thing. You already have some good answers.<p>Now imagine how most things around you were made, how higher tech was made with lower tech. How did they make high-precision tools when only lower-precision tools were available? For example: how do you make a caliper precise to 0.001 mm when all you have is one precise to 0.1 mm? There were a lot of challenges like that, and we still run into new ones. I just wonder what the general term for this kind of thing is.
Any self-hosting language compiler has to start from somewhere.<p>Generally it's an incremental process where the compiler for an early/subset version of the language is written in another, existing language (absent one, possibly assembly code).<p>Once it's possible to rewrite the compiler in its own subset language, it becomes self-hosting. Then you can add a feature to the language, and once it works, enhance the compiler to use it, and so on.<p>Eventually the language and compiler go hand in hand: the only way to compile the language is with the compiler, and the only way to compile the compiler is with itself. This leads to interesting thought experiments such as:<p><a href="https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_ReflectionsonTrustingTrust.pdf" rel="nofollow">https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref...</a>
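The "add a feature, then start using it in the compiler" cycle described above can be sketched as a small Python model. Everything here is hypothetical: the `build` function and the feature names "functions" and "structs" are just stand-ins. The only rule it models is that a compiler's source can use nothing the compiler building it doesn't already understand.

```python
def build(compiler_supports: frozenset, uses: frozenset, implements: frozenset) -> frozenset:
    """Return the feature set of the compiler produced by building a source tree.

    The source may only use features the building compiler already supports.
    """
    missing = uses - compiler_supports
    if missing:
        raise ValueError(f"cannot build: compiler lacks {sorted(missing)}")
    return implements


# Step 0: the subset compiler, written in some other existing language.
v0 = frozenset({"functions"})

# Step 1: rewrite the compiler in the subset itself; it is now self-hosting.
v1 = build(v0, uses=frozenset({"functions"}), implements=frozenset({"functions"}))

# Step 2: add "structs" to the language, without using it in the compiler source yet.
v2 = build(v1, uses=frozenset({"functions"}),
           implements=frozenset({"functions", "structs"}))

# Step 3: now the compiler source may use "structs", because v2 understands it.
v3 = build(v2, uses=frozenset({"functions", "structs"}),
           implements=frozenset({"functions", "structs"}))

# Skipping step 2 fails: v1 cannot compile source that already uses "structs".
try:
    build(v1, uses=frozenset({"functions", "structs"}),
          implements=frozenset({"functions", "structs"}))
    skipped_ok = True
except ValueError:
    skipped_ok = False

print(sorted(v3), skipped_ok)
```

The two-step dance (implement the feature first, use it in the compiler only afterwards) is what keeps every version buildable by the previous one.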
As the question has already been answered, I'll just add that the general principle is known as self-hosting:<p><a href="https://en.wikipedia.org/wiki/Self-hosting_(compilers)" rel="nofollow">https://en.wikipedia.org/wiki/Self-hosting_(compilers)</a>
The question is malformed because C programs don't need Unix to run. In theory, C could have been created without an OS. In practice, however, C was created in order to rewrite Unix in a high-level language (previously it was in assembly).
A related question: were operating systems first known as time-sharing systems? Most of what I'd call an OS from this era seems to have been termed that. I also read that the term OS was coined for CP/M.
Lots of C implementations don't/didn't use Unix as the underlying OS.<p>There were Cs for MS-DOS, Cs for CP/M, even Cs for Windows, etc.