This post by Mike Pall goes into more depth on the assembly interpreter details, particularly things like branch prediction: <a href="http://article.gmane.org/gmane.comp.lang.lua.general/75426" rel="nofollow">http://article.gmane.org/gmane.comp.lang.lua.general/75426</a>
Interesting article. If you're interested in the labels-as-values approach he mentioned, Eli Bendersky recently wrote an article about it: <a href="http://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables/" rel="nofollow">http://eli.thegreenplace.net/2012/07/12/computed-goto-for-ef...</a>
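For anyone who hasn't seen labels as values before, here is a minimal sketch of the idea, assuming GCC or Clang (it is a compiler extension, not standard C). The opcode names and the `run` function are made up for illustration and are not taken from LuaJIT or from either linked article.

```c
/* Hypothetical opcodes for a toy accumulator VM (illustration only). */
enum { OP_HALT, OP_INC, OP_DEC, OP_DOUBLE };

/* Runs `code` starting from `acc` and returns the final accumulator.
   Relies on the GCC/Clang "labels as values" extension:
   &&label takes a label's address, goto *ptr jumps to it. */
static int run(const unsigned char *code, int acc)
{
    /* Table indexed by opcode; each entry is a handler label's address. */
    static void *const dispatch[] = {
        &&do_halt, &&do_inc, &&do_dec, &&do_double
    };
    const unsigned char *pc = code;

    /* Fetch the next opcode and jump straight to its handler. Each handler
       ends with its own copy of this indirect jump instead of looping back
       to one shared switch statement. */
#define NEXT() goto *dispatch[*pc++]

    NEXT();
do_inc:    acc += 1; NEXT();
do_dec:    acc -= 1; NEXT();
do_double: acc *= 2; NEXT();
do_halt:   return acc;
#undef NEXT
}
```

Running the program INC, INC, DOUBLE, DEC, HALT with a starting accumulator of 0 returns 3. The point of the layout is that every handler has its own indirect jump, so the branch predictor gets a separate history per opcode rather than one shared dispatch branch. How much that helps depends on the CPU and compiler.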
"LuaJIT 2 has interpreters for 6 architectures of around 4000K lines per architecture (ARMv6, MIPS, PPC, PPCSPE/Cell, x86, x86-64)."<p>Is that number 4000K correct? That sounds awfully big to me.
Andy Wingo's exploration of JavaScriptCore's low-level interpreter [1] describes the reasoning behind the Ruby offline assembly generator.<p>Actually, if you're reading the original linked article, you should also be reading Andy's series on V8.<p>[1] <a href="http://wingolog.org/archives/2012/06/27/inside-javascriptcores-low-level-interpreter" rel="nofollow">http://wingolog.org/archives/2012/06/27/inside-javascriptcor...</a>
Anton Ertl and David Gregg cracked this nut a <i>long</i> time ago with Gforth / vmgen! It's only when you add registers to their basic two-stack model, like Parrot does, that you gain any more speed.