Contrary to the naysayers, I like seeing stuff like this. Why? Because it's a simple, gentle introduction. It's easily digestible for the newcomer. And it might be easy enough to encourage a newcomer to start building their own VM that goes on to be something real.<p>For those who criticize it and find fault with it: I'm sure the author would consider pull requests. Or you could provide your own fork with all the improvements that you believe are necessary.
I'd like to see such an article on a register-based VM. Pawn and Lua are nice examples. Most VMs are stack-based, but this is mainly because they are conceptually easier to understand. Register-based machines have some real advantages, like requiring far fewer instructions inside tight loops.
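To make the instruction-count point above concrete, here is a minimal sketch in Python. Both instruction sets are hypothetical and invented for illustration (they are not Pawn's or Lua's). Each interpreter sums the integers 1..n and counts how many instructions it dispatches. It assumes n >= 1.

```python
def run_stack(program):
    """Stack machine: operands live on a stack, locals in numbered slots."""
    stack, slots, pc, executed = [], [0, 0], 0, 0
    while True:
        op, *args = program[pc]
        pc += 1
        executed += 1
        if op == "PUSH":
            stack.append(args[0])
        elif op == "LOAD":
            stack.append(slots[args[0]])
        elif op == "STORE":
            slots[args[0]] = stack.pop()
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "SUB":
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif op == "JNZ":            # pop a value, jump if it is non-zero
            if stack.pop() != 0:
                pc = args[0]
        elif op == "HALT":
            return slots, executed


def run_register(program):
    """Register machine: each instruction names its operands directly."""
    regs, pc, executed = [0, 0], 0, 0
    while True:
        op, *args = program[pc]
        pc += 1
        executed += 1
        if op == "LOADI":            # regs[d] = immediate
            regs[args[0]] = args[1]
        elif op == "ADD":            # regs[d] = regs[a] + regs[b]
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "SUBI":           # regs[d] = regs[a] - immediate
            regs[args[0]] = regs[args[1]] - args[2]
        elif op == "JNZ":            # jump if regs[r] is non-zero
            if regs[args[0]] != 0:
                pc = args[1]
        elif op == "HALT":
            return regs, executed


def stack_sum(n):
    # slot 0 = acc, slot 1 = i; the loop body starts at index 4
    return [("PUSH", 0), ("STORE", 0), ("PUSH", n), ("STORE", 1),
            ("LOAD", 0), ("LOAD", 1), ("ADD",), ("STORE", 0),   # acc += i
            ("LOAD", 1), ("PUSH", 1), ("SUB",), ("STORE", 1),   # i -= 1
            ("LOAD", 1), ("JNZ", 4), ("HALT",)]


def register_sum(n):
    # r0 = acc, r1 = i; the loop body starts at index 2
    return [("LOADI", 0, 0), ("LOADI", 1, n),
            ("ADD", 0, 0, 1), ("SUBI", 1, 1, 1), ("JNZ", 1, 2), ("HALT",)]


stack_slots, stack_count = run_stack(stack_sum(10))
reg_regs, reg_count = run_register(register_sum(10))
print(stack_slots[0], stack_count)   # same sum, more dispatches
print(reg_regs[0], reg_count)
```

In this toy the loop body costs ten dispatches on the stack machine and three on the register machine. A real stack VM can narrow that gap with fused instructions (an "increment local", say), and register instructions are usually wider to encode, so treat the ratio as illustrative rather than typical.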
I'm a bit disappointed - this VM doesn't have instructions for looping or branching, nor does it really use the registers in any way. I was hoping to read a writeup that introduced some concepts that were used in real (non-toy) systems.
I'm pretty sure everyone has written their own toy VM, but I'll go ahead and throw mine out there (well, the one of the three I've written that I like best). It's called LightVM and is intended to be capable of running on tiny microcontrollers.<p>The coolest thing about it is that the opcodes and registers are extremely general purpose. So, to do a branch, you do `mov IP, label`. There is also a "push.mv" instruction, which when used against IP is basically the same as the usual "call" instruction, but which can also be used with data registers to save a register to the stack and then set it to a value.<p>I've found the hardest thing about making a VM isn't making the VM itself, but rather making the infrastructure around it (assembler, debugger, compilers, etc.).<p><a href="https://bitbucket.org/earlz/lightvm/overview" rel="nofollow">https://bitbucket.org/earlz/lightvm/overview</a>
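A rough Python sketch of the "IP is just another register" idea described in the comment above. This is not LightVM's actual encoding or semantics. The register numbering, the tuple format, and the `pop`, `add`, and `halt` instructions are all assumptions made up for illustration. Only the `mov IP, label` and "push.mv" behavior follows the description.

```python
IP, R0, R1 = 0, 1, 2  # hypothetical register numbering


def run(program, max_steps=1000):
    regs = [0, 0, 0]
    stack = []
    for _ in range(max_steps):
        op, *args = program[regs[IP]]
        regs[IP] += 1                    # IP now points at the next instruction
        if op == "mov":                  # mov reg, imm  (a branch when reg is IP)
            regs[args[0]] = args[1]
        elif op == "push.mv":            # save reg on the stack, then set it
            stack.append(regs[args[0]])  # with IP this saves the return address
            regs[args[0]] = args[1]
        elif op == "pop":                # pop reg  (a return when reg is IP)
            regs[args[0]] = stack.pop()
        elif op == "add":                # add reg, imm
            regs[args[0]] += args[1]
        elif op == "halt":
            return regs
    raise RuntimeError("step limit exceeded")


program = [
    ("mov", R0, 1),        # 0
    ("mov", R1, 99),       # 1
    ("push.mv", IP, 6),    # 2: "call" the routine at index 6
    ("mov", IP, 5),        # 3: branch over the next instruction
    ("add", R0, 1000),     # 4: skipped
    ("halt",),             # 5
    ("push.mv", R1, 0),    # 6: save R1, then use it as scratch
    ("add", R0, 10),       # 7
    ("pop", R1),           # 8: restore R1
    ("pop", IP),           # 9: "return" to index 3
]

regs = run(program)
print(regs[R0], regs[R1])
```

The appeal of the design is that call, return, and jump need no dedicated opcodes. They fall out of ordinary data-movement instructions applied to IP.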
For those who want to implement a VM as an exercise, I recommend implementing a simple JIT compiler after that. You'll probably be impressed by the performance improvements, and it's a fun exercise to do. I used GNU lightning to generate machine code.
I am starting to sound like a broken record, but here it goes. If you want a more complete tutorial on writing stack-based virtual machines, check "The Elements of Computing Systems" and its accompanying course, <a href="http://www.nand2tetris.org/" rel="nofollow">http://www.nand2tetris.org/</a>.<p>The book teaches you to build:<p>1) A CPU from basic electronics elements<p>2) An assembler to generate machine code<p>3) A bytecode VM that can be simulated, plus a translator that generates assembly from the bytecode<p>4) A basic programming language that generates bytecode<p>5) An operating system using that language.<p>I'm midway through building the Assembler and VM myself :-).
This project is nice for educational purposes, but I wouldn't call it a VM; I'd call it a "bytecode interpreter".<p>I think nowadays it is kind of a minimum requirement to have the intermediate code JIT-compiled (or at least compiled).<p>I'm also missing a garbage collector, although that is not necessarily part of a VM (but often is). See NaCl for a counterexample. By the way, a project that I'd like to see is an efficient garbage collector implemented inside the VM, instead of being part of the VM.
For everyone who enjoyed this or wants to take it a step further, I recommend writing a CHIP-8 emulator. I used the following source: <a href="http://www.multigesture.net/articles/how-to-write-an-emulator-chip-8-interpreter/" rel="nofollow">http://www.multigesture.net/articles/how-to-write-an-emulato...</a> and it was very helpful.
For people looking for less "toy" implementations, I've written two emulators, an 8086 one and a Z80 one.<p>There's libz80 (<a href="https://github.com/ggambetta/libz80" rel="nofollow">https://github.com/ggambetta/libz80</a>) which is (AFAIK) quite complete and correct but just a library, and the 8086 one (<a href="https://github.com/ggambetta/emulator-backed-remakes" rel="nofollow">https://github.com/ggambetta/emulator-backed-remakes</a>) which is incomplete and buggy but serves a much more interesting purpose :)
While seemingly trivial, the non-Turing-complete example is not too far off from the (simple) Forth-like stack-based programming language found and executed in Bitcoin transactions.<p><a href="https://en.bitcoin.it/wiki/Script" rel="nofollow">https://en.bitcoin.it/wiki/Script</a>
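To illustrate the comparison above, here is a toy Python evaluator for a loop-free, Forth-like script. The opcode names are borrowed from Bitcoin Script, but this is only a sketch. It ignores byte encoding, signatures, and the real consensus rules, and the `eval_script` helper is hypothetical.

```python
def eval_script(tokens):
    """Evaluate a loop-free, Script-like token list and return the final stack."""
    stack = []
    for tok in tokens:  # one forward pass with no jumps, so it always terminates
        if isinstance(tok, int):
            stack.append(tok)            # integer literals are pushed as data
        elif tok == "OP_DUP":
            stack.append(stack[-1])
        elif tok == "OP_ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif tok == "OP_EQUAL":
            b, a = stack.pop(), stack.pop()
            stack.append(1 if a == b else 0)
        else:
            raise ValueError(f"unknown token: {tok!r}")
    return stack


print(eval_script([2, 3, "OP_ADD", 5, "OP_EQUAL"]))
print(eval_script([4, "OP_DUP", "OP_ADD"]))
```

With no branch or loop instruction, every script finishes after one pass over its tokens, which is the sense in which such a language is not Turing-complete.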
C is not my thing, so a few years ago, while trying to sort out how a VM works, I created one in Ruby.<p>Practical? Not in the least. But it was a good weekend's worth of fun.<p><a href="https://github.com/patrickjonesdotca/carban" rel="nofollow">https://github.com/patrickjonesdotca/carban</a>
For a project that goes a bit deeper (branching, I/O, etc.), consider writing a CHIP-8 simulator. There are lots of games written in CHIP-8 bytecode to test with!
I find these kinds of very basic intro articles frustrating. They till the same ground over and over: a tiny instruction set implemented with a switch statement. None of the more difficult issues are addressed: exception handling, linking to libraries or other programs written for the same VM, portability of programs across architectures, accessing the OS for services like file I/O, time, etc. These are all the things that make a toy not a toy.<p>Every CS student in the world has written a toy VM just like this one.
I think the title is very misleading. This is not a virtual machine but an interpreter for a made-up assembly language. There is nothing wrong with that, and I am sure a beginner would find it very useful. But reading the title, I was expecting something quite different.