Author here.<p>As I wrote in the README, this uses GAWK instead of plain AWK. Conveniences of GAWK over AWK in a nutshell:
- functions
- several additional functions (e.g. bit shifting)<p>But even GAWK lacks some things that are very common in other languages:
- no block-level variable scope: variables are global by default. imagine calling another function inside a for loop, where that function runs a for loop of its own. if both loops use 'i' as the counter, good luck. the workaround is to declare the local variables as extra parameters that are never passed (and, by convention, separate them from the real parameters with four spaces)
- cannot return an array from a function. the workaround is to have the caller pass in an array for the function to fill, i.e. pass-by-reference (not sure if the precise definition is applicable here)
- arrays cannot be assigned to another variable. the workaround is to loop over the array and copy it element by element.<p>If anybody knows better workarounds, please let me know :)
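For anyone following along, here is a minimal sketch of the three workarounds listed above. It is not taken from the project: the function names (`inner`, `squares`, `copy_array`) are made up for illustration, and it assumes only a POSIX-style `awk` on the PATH (nothing GAWK-specific is needed for these tricks).

```sh
out=$(awk '
# Parameters after the extra spaces are never passed by callers,
# so they behave as local variables.
function inner(n,    i, sum) {
    for (i = 1; i <= n; i++) sum += i
    return sum
}

# Arrays are passed by reference, so "returning" an array means
# filling one that the caller supplies.
function squares(n, result,    i) {
    split("", result)                 # portable way to empty the array
    for (i = 1; i <= n; i++) result[i] = i * i
    return n                          # scalar return value: element count
}

# Arrays cannot be assigned, so copy element by element (flat arrays only).
function copy_array(src, dst,    k) {
    split("", dst)
    for (k in src) dst[k] = src[k]
}

BEGIN {
    # The global i used here is not clobbered by the local i inside inner().
    for (i = 1; i <= 3; i++) total += inner(i)
    n = squares(3, sq)
    copy_array(sq, cp)
    print total, n, cp[1], cp[2], cp[3]
}
')
printf '%s\n' "$out"
```

Note that `copy_array` as written only handles flat arrays. For GAWK's arrays of arrays, a copy would presumably need to recurse into subarrays (GAWK provides `isarray()` to detect them); that part is left out of this sketch.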
Not the main point, and this is a very cool dancing bear. But, on this point:<p><i>"since none of the awks can read binary, you first need to pipe the classfile through hexdump"</i><p>Gawk works fine with binary files for me. Using FIELDWIDTHS for fixed-length records or the readfile() extension to slurp in a whole file works fine. The getline command can also be paired with FIELDWIDTHS to read a fixed number of bytes. Newline-separated records with nulls in them also read as expected. I'm curious what problems the author saw with binary and gawk.
The AWK Programming Language book is great not just for learning AWK but also has chapters on data processing, generating tables and graphs, relational databases and even a VM! I've implemented the assembler and VM from the book and extended the instruction set.[0]<p>[0] <a href="https://github.com/siraben/awk-vm" rel="nofollow">https://github.com/siraben/awk-vm</a>
METHODS[m]["attributes"][a]["data"][4]<p>Is this a five dimensional array? How does it get populated?<p>EDIT: I see it now in the code. Excellent!
My first thought reading the headline was that this was an implementation of Awk that ran on a JVM. But no, this is the reverse of that. This way is far less useful but infinitely more interesting. Bravo!
Added a pull request that wraps typeof() and calls a "polyfill" if the typeof() function doesn't exist. Gawk didn't have that until v4.2.