People write minimal Lisps because they're easy while being elucidating. You can describe a compiler for a more-than-minimal Lisp in a forum post.<p>1) Specify representations for: integers, strings, symbols, and cons-es; specify a data structure for the symbol table and define INTERN properly.<p>2) Implement the reader, cribbing heavily from CLHS 2.1.4, 2.2. Set up constituent, whitespace, and terminating character classes, and implement reader macros for at least quote, double quote, and left/right parens.<p>3) Specify a representation for the environment, which will map symbols to offsets of variables on the stack frame; have a stack of these objects.<p>4) Figure out what special operators to handle. You can get away with just: lambda, progn, if, quote, let.<p>5) Starting from each top level form, walk the code recursively. Each element you encounter will either be:
a) a top-level definition
b) a single symbol variable reference
c) a list with a symbol at its head which is a special form
d) a list with a symbol at its head which is a function call<p>As you walk the code, push a new environment onto the stack for each lambda and let you come across, which you pop off when you're done walking the corresponding body (but make sure you save it somewhere).<p>5.1) In one pass, collect information on closed-over variables. Walk the code looking for variables which are referenced (5b) but are not present in the environment. In that case search up the stack. If you find it in a parent environment, it's a closed-over variable. If you don't find it at all, it's a global variable.<p>5.2) Use the environment structures you just collected to assign local variables to stack slots. Figure out a closure representation (use a code pointer followed by an array of closed-over variables). Use the info you collected in (5.1) to assign variables to stack slots and closure variable references to closure variable slots.<p>5.3) In a second pass, generate code. When you reach:<p>a) A TLD, emit a call to the runtime to associate a global name with a lambda
b) A variable reference, emit code to either load the value from the stack slot noted in your environment structure, or from the appropriate closure slot, or emit a call to the runtime to look up a global value.
c) A special form: generate function prologues/epilogs when you reach a lambda form, use labels and jumps to implement 'if', and use whatever data facilities your assembler has to handle quote-ed literals. When you reach a nested lambda, emit its code as a separate function, then at the point where it is used as a value in the parent function call a closure creation function providing it with the address of the generated code plus the values of all closed-over variables.
d) A function call, emit a call to the runtime to look up the global value, then evaluate each of the arguments left to right, pushing them onto the stack. Then, jump to the body of the function.<p>6) Implement the runtime functions for cons handling, the global symbol table, and closure creation.<p>Extend as you see fit to include mutable variables, etc.