> The 5.0 VM is a register machine, which operates on a set of virtual registers that can store and act on the local variables of a function, in addition to the traditional runtime stack.<p>This is a common source of confusion, because the name "register machine" makes people think about CPU registers. However, the registers in a register VM are merely slots in the traditional runtime stack. The difference between a stack and register machine has to do with how the bytecode instructions are encoded. In a stack machine, most instructions implicitly pop arguments from and push their results to the top of the stack. The instructions are byte-sized, encoding just the operation name. For example, to add 10+10<p><pre><code> LOADK 10
DUP
ADD
</code></pre>
Meanwhile, in a register machine the instructions can read and write to any slot in the stack. Instructions are larger, because in addition to the operation name they also encode the indexes of the input slots and the index of the output slot. But it's worth it because you need fewer instructions in total and do less stack shuffling.<p><pre><code> LOADK 1 10
ADD 1 1 2</code></pre>
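To make the difference concrete, here is a minimal sketch of both styles as toy interpreters. Everything in it (`run_stack`, `run_register`, the tuple encoding, the slot count) is made up for illustration and is not Lua's actual VM. The operand order for `ADD` (two inputs, then output) follows the example above; as far as I know, real Lua bytecode puts the destination slot first.

```python
# Toy stack machine: operands are implicit (always the top of the stack).
def run_stack(program):
    stack = []
    for op, *args in program:
        if op == "LOADK":
            stack.append(args[0])
        elif op == "DUP":
            stack.append(stack[-1])
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack


# Toy register machine: every instruction names the slots it touches.
def run_register(program, nslots=4):
    slots = [None] * nslots  # "registers" are just slots in the frame
    for op, *args in program:
        if op == "LOADK":
            dst, const = args
            slots[dst] = const
        elif op == "ADD":
            in1, in2, out = args  # order follows the comment's example
            slots[out] = slots[in1] + slots[in2]
        else:
            raise ValueError(f"unknown opcode {op}")
    return slots


stack_result = run_stack([("LOADK", 10), ("DUP",), ("ADD",)])
register_result = run_register([("LOADK", 1, 10), ("ADD", 1, 1, 2)])
print(stack_result)     # [20]
print(register_result)  # [None, 10, 20, None]
```

Three one-field instructions versus two wider ones: the register version never has to duplicate or shuffle a value to get it into position.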
Lua's simplicity is sometimes its real selling point. I was just today searching for a small scripting language to embed in a mobile app in .NET, where app size is at a premium, and it turns out that the smallest useful JavaScript interpreter is at least 3x the size of a Lua interpreter.<p>I do believe that an un-bloated JavaScript language from when it was just invented would be simpler than Lua (as both were designed as "scripting" languages, not as main ones), but history didn't go that route :)<p>But... Lua is WEIRD! Weird nomenclature, weird string concatenation operator, 1-based arrays, too-clever "tables" and "metatables" stuff.
I'd love to see upvalues diagrammed as they are represented in memory.<p>It sounds like the stack is perhaps a Stack<Frame pointer¹>, where each Frame contains the locals for that stack frame; then a coroutine just needs to keep a pointer to the stack frame it is closing over. (And then, Frame does too, recursively, in case there is more than one scope being closed over.)<p>This would be extremely similar to … most any other language … and makes me wonder why Lua gives them such a unique name. It has been hard to really comprehend the Lua spec when I've tried to understand that facet of it.<p>(I'd also argue that Lua isn't as simple as it is made out to be: primitives behave wildly differently from objects, there's the 1-based indexing, and multiple-return is pretty much unique to Lua (vs. returning a tuple, which other languages such as Python, Rust, and sort-of JS go for; I think that's conceptually simpler).)<p>¹and note "pointer" here might really be "GC ref", to permit these to be GC'd as necessary, as closures can keep stuff alive far longer than normal.
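In lieu of a diagram, here is a rough sketch of the open/closed upvalue idea as I understand it from the Lua 5.0 paper: a captured variable is reached through a small indirection object that points at the stack slot while the variable is still in scope, and takes ownership of the value once the enclosing frame exits. The `UpValue` class and its method names are hypothetical, and this models per-variable capture rather than the per-frame capture guessed at above.

```python
class UpValue:
    """One captured variable, shared by every closure that captures it."""

    def __init__(self, stack, index):
        self.stack = stack        # open: refers to a live stack slot
        self.index = index
        self.closed_value = None  # closed: the value lives here instead
        self.is_open = True

    def get(self):
        return self.stack[self.index] if self.is_open else self.closed_value

    def set(self, value):
        if self.is_open:
            self.stack[self.index] = value
        else:
            self.closed_value = value

    def close(self):
        # Called when the enclosing frame exits: move the value off the stack.
        self.closed_value = self.stack[self.index]
        self.stack = None
        self.is_open = False


stack = [None, 0]        # slot 1 holds a local variable, say `count`
uv = UpValue(stack, 1)


def increment():         # stands in for a closure that captured `count`
    uv.set(uv.get() + 1)
    return uv.get()


increment()              # while open, writes go through to the stack slot
open_view = stack[1]     # 1
uv.close()               # enclosing function returns
stack[1] = "reused"      # the slot is recycled by some later call
after_close = increment()  # 2, unaffected by the recycled slot
```

The point of the indirection is that closures never need a pointer to a whole frame, so frames can stay on the ordinary stack and only the variables that are actually captured outlive them.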
Lua 5.0 doesn't have the incremental garbage collector, only the original mark-and-sweep collector.<p>The pdf they based the blog post off of even says the incremental GC is <i>upcoming</i> in 5.1, and you can read through <a href="https://www.lua.org/source/5.0/lgc.c.html" rel="nofollow">https://www.lua.org/source/5.0/lgc.c.html</a> yourself and see that, unlike the 5.1 version, it only has a single mark list and doesn't use the tri-color scheme.
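For anyone unfamiliar with the distinction, a plain stop-the-world mark-and-sweep pass looks roughly like the sketch below: one worklist of objects still to be traversed, a mark bit per object, and a sweep over the whole heap. This is a generic illustration with made-up names (`Obj`, `collect`), not a transcription of Lua's `lgc.c`. An incremental collector has to track more state so that marking can be paused and resumed while the program keeps mutating objects.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []       # outgoing references to other objects
        self.marked = False


def collect(heap, roots):
    # Mark phase: a single worklist of objects still to be traversed.
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj.marked:
            continue
        obj.marked = True
        worklist.extend(obj.refs)

    # Sweep phase: keep marked objects, drop the rest, reset the mark bits.
    survivors = [o for o in heap if o.marked]
    for o in survivors:
        o.marked = False
    return survivors


a, b, c = Obj("a"), Obj("b"), Obj("c")
a.refs.append(b)
b.refs.append(a)             # a cycle is handled fine by tracing
heap = [a, b, c]
live = collect(heap, roots=[a])
live_names = sorted(o.name for o in live)
print(live_names)            # ['a', 'b']
```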
Timing on this is confusing. The article was written in 2020 about a paper published in 2003 or so about the design of Lua 5.0.<p>It's still a really insightful and approachable paper that's worth reading, and it does still help in understanding the constraints and approach of Lua. But the current "old" version of the language is 5.2, released in 2011, and there have been a couple of major versions after that as well.<p>So some of it may not actually be applicable to using Lua now, depending on what your "now" looks like.
I love that they went from a single-pass interpreter to a byte-code virtual machine, and that now, like PHP, it also has direct access to C and system libraries! Lua has come a long way since its advent.
I suppose that's the price for the small core but I do wish Lua had string interpolation.<p>Instead there are like 5 hacky ways to do it: <a href="http://lua-users.org/wiki/StringInterpolation" rel="nofollow">http://lua-users.org/wiki/StringInterpolation</a>
I made some seemingly similar design choices in TXR Lisp (not knowing anything about Lua or its internals).<p>- register-based VM with 32-bit instruction words holding 6-bit opcodes.<p>- closures that start on the stack and are moved to the heap.<p>- single-pass compiler with no SSA, Lisp straight to code
- but with additional optimization, informed by control and data flow analysis, done on the VM assembly code.
The Lua interpreter's "upvalues" sound suspiciously like the results of Tcl's "upvar". Can anybody comment on how similar they actually are?