科技回声

9 comments

stiff, more than 10 years ago

I am on vacation and bored, so I decided to try to actually understand this. I do not know much APL/K/J, but for a start, here is a version with more traditional line breaking: <a href="https://gist.github.com/anonymous/f72e5c4a432492abce59" rel="nofollow">https://gist.github.com/anonymous/f72e5c4a432492abce59</a>

Some basic observations:

- He uses all the obscure C features, like the ability to omit the type of an argument or return value and let the compiler default it to int, and also the old-school style of declaring functions:
<pre><code>
foo(x,y) int x, y; { return 0; }
</code></pre>
Instead of:
<pre><code>
int foo(int x, int y) { return 0; }
</code></pre>
Since the compiler will assume int, and since he uses R for return, this example eventually becomes:
<pre><code>
foo(x,y){R(0);}
</code></pre>

- Variables, or really registers, are only accessible as letters from a to z, and the "st" array stores the values of all registers. Numbers entered in the REPL have to be between 0 and 9, and hence he avoids the dirty work of making a proper lexer. It's also super easy to trigger a segfault, since error handling is non-existent.

- DO(n,x) is a C macro that evaluates the given expression "x" once for each integer from 0 up to, but not including, "n".

- V1 is a C macro that defines unary operators for the interpreted language, and V2 defines binary operators. In V1 definitions the operand is called "w"; in V2 the operands are called "a" and "w".

- For example ",", which calls the cat function, is a binary operator that creates vectors:
<pre><code>
1,2,3,4
4
1 2 3 4
</code></pre>

- The vt, vd and vm arrays map ASCII symbols to the functions defined with V1 and V2. { is the second symbol in vt, so when used as a unary operator it calls "size" (second non-null element of vm):
<pre><code>
{5,6,7,8
4
</code></pre>
and when used as a binary operator it calls "from" (second non-null element of vd).

- wd is a parser that goes from the original input string to a weird intermediate form that is an array of longs. Each input character gets mapped one-to-one to an item in this intermediate form.
<pre><code>
If the input character was a number between 0 and 9:
    Value type instance gets allocated
    Intermediate form for this input character consists of the address of the allocated instance
If the character is a letter between "a" and "z":
    Intermediate form consists simply of this character
If the character represents an operator
    Intermediate form consists of the index of the operator in the vt array
</code></pre>
In other words, the intermediate form is an array where some elements are ASCII characters, others are memory addresses, and yet others are indices into some array. This part is really something.

- The ex function executes the intermediate form. Since everything in the input is fixed length, and there is no syntax checking, it just indexes into the intermediate form assuming everything is well formed. The parser did not check that, so it is not really guaranteed, which is again a source of easy segfaults. The execution starts at the left and consists of looking at the first position in the intermediate form and then making recursive calls on the rest if necessary (let X be the current item in the intermediate form):
<pre><code>
If X is a character
    Look ahead one item
    If it is a '=' char
        Assign the result of executing everything after the '=' to the register indicated by X
    Assign to X the value of the register named by X
If X is not a character and is a small integer
    We are applying a unary operator
    X is the index into the "vm" array
    Fetch the function from "vm", apply it to the result of executing the rest of the intermediate form
Otherwise:
    If there is any more input remaining other than the current item, we are applying a binary operator
    Look up the function in "vd", apply it to X (the left operand) and to the result of executing everything to the right of the operator
</code></pre>

- I have the biggest problem with understanding that "a" struct, which represents all values in the interpreted language, all of which are arrays. ga is clearly the basic allocation function for it, and "plus" obviously adds two arrays, so it's clear the "p" field holds the actual contents, but that's where things get very shady.
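The tokenize-then-execute scheme described above can be sketched in modern C roughly as follows. This is not Whitney's code. It is a minimal sketch that assumes scalar integer values instead of arrays, uses a hypothetical tagged `Tok` struct in place of the untyped array of longs, and supports only `+` and `-` as a hypothetical verb table (unary `+` is identity, unary `-` negates). All names here (`Tok`, `tokenize`, `eval`, `run`, `regs`) are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tagged token: replaces the untyped long of the original. */
typedef enum { T_END, T_NUM, T_REG, T_VERB, T_ASSIGN } Tag;
typedef struct { Tag tag; long v; } Tok;

enum { MAXTOK = 99 };
static long regs[26];            /* plays the role of the "st" register array */

/* Map each input character to exactly one token; returns 0 on bad input. */
static int tokenize(const char *s, Tok *out) {
    size_t n = 0;
    for (; *s; ++s) {
        char c = *s;
        if (n + 1 >= MAXTOK) return 0;           /* keep room for T_END */
        if (c >= '0' && c <= '9')      out[n++] = (Tok){T_NUM, c - '0'};
        else if (c >= 'a' && c <= 'z') out[n++] = (Tok){T_REG, c - 'a'};
        else if (c == '=')             out[n++] = (Tok){T_ASSIGN, 0};
        else if (c == '+' || c == '-') out[n++] = (Tok){T_VERB, c};
        else return 0;                           /* unknown character */
    }
    out[n] = (Tok){T_END, 0};
    return 1;
}

/* Evaluate the tokens starting at e; clears *ok on malformed input. */
static long eval(const Tok *e, int *ok) {
    long a, w;
    switch (e[0].tag) {
    case T_VERB:                 /* leading verb: unary, applied to the rest */
        a = eval(e + 1, ok);
        return e[0].v == '-' ? -a : a;
    case T_REG:
        if (e[1].tag == T_ASSIGN)                /* x=... stores and returns */
            return regs[e[0].v] = eval(e + 2, ok);
        a = regs[e[0].v];
        break;
    case T_NUM:
        a = e[0].v;
        break;
    default:                     /* T_END or a stray '=' where a value belongs */
        *ok = 0;
        return 0;
    }
    if (e[1].tag == T_END) return a;             /* lone value */
    if (e[1].tag != T_VERB) { *ok = 0; return 0; }
    w = eval(e + 2, ok);         /* binary verb: right side is evaluated first */
    return e[1].v == '+' ? a + w : a - w;
}

/* Returns 1 and stores the value in *result on success, 0 on bad input. */
int run(const char *src, long *result) {
    Tok toks[MAXTOK];
    int ok = 1;
    assert(src != NULL && result != NULL);
    if (!tokenize(src, toks)) return 0;
    *result = eval(toks, &ok);
    return ok;
}
```

Because the right-hand side is always evaluated before the binary verb is applied, `run("5-2-1", &r)` yields 4, not 2, which matches the APL-style right-to-left convention. Unlike the original, malformed input such as `1+` is reported through the return value instead of running off the end of the token array.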

nine_k, more than 10 years ago

This is a perfect illustration of why normal people (like me, unlike Arthur) have problems working with his apparently brilliant creations, such as the k language. The amount of information required to efficiently read (let alone write) such texts just does not fit into normal people's working memory. One-letter names and the absence of comments don't help either. (WRT weird names: IIRC, some time ago kdb featured a number of functions named with digits, like '2' being the function to connect to a socket or something.)

golemotron, more than 10 years ago

I want to offer a bounty for a version that compiles with a contemporary C compiler. There's some magic going on with longs being treated as pointers here.
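One common way to make "integers that sometimes hold pointers" safe on a contemporary compiler is to use `intptr_t` from `<stdint.h>` instead of `long`, since `long` is narrower than a pointer on some 64-bit platforms (64-bit Windows, for example). The following is only a sketch of that technique, not a port of the interpreter. `Box`, `box_new`, `box_get` and `box_free` are hypothetical names, and `intptr_t` is formally an optional type, although mainstream platforms provide it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical heap value whose address is stored in an integer slot. */
typedef struct { long value; } Box;

/* Allocate a Box and return its address as an integer handle (0 on failure). */
intptr_t box_new(long v) {
    Box *b = malloc(sizeof *b);
    if (b == NULL) return 0;
    b->value = v;
    return (intptr_t)b;          /* pointer to intptr_t round-trips safely */
}

/* Convert the integer handle back to a pointer and read the value. */
long box_get(intptr_t h) {
    assert(h != 0);
    return ((Box *)h)->value;
}

/* Release a handle obtained from box_new; a 0 handle is ignored. */
void box_free(intptr_t h) {
    free((Box *)h);
}
```

With this approach an array of `intptr_t` can mix small integers and addresses the way the interpreter's intermediate form does, although telling the two apart still needs a convention or a separate tag.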

FullyFunctional, more than 10 years ago

This transcript of Roger Hui's talk sheds more light on the general idea: <a href="http://archive.vector.org.uk/trad/v094/hui094_85.pdf" rel="nofollow">http://archive.vector.org.uk/trad/v094/hui094_85.pdf</a> It's interesting that what he calls "Parsing" is more typically known as term reduction by pattern matching in the functional world.

marktangotango, more than 10 years ago

He "cheats" by putting multiple statements on one line: noun(c){A z;if(c<'0'||c>'9')R 0;z=ga(0,0,0);*z->p=c-'0';R z;} would be
<pre><code>
noun(c){
  A z;
  if(c<'0'||c>'9')
    R 0;
  z=ga(0,0,0);
  *z->p=c-'0';
  R z;
}
</code></pre>
In this form, it's still cryptic, but not quite as inscrutable.
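For comparison, the same function could be written with prototypes and explicit types roughly as follows. This is a sketch under assumptions, not a drop-in replacement: `Value` is a hypothetical, scalar-only stand-in for the interpreter's `A` type, and plain `malloc` stands in for `ga`.

```c
#include <stdlib.h>

/* Hypothetical simplified value: one slot for the payload. */
typedef struct {
    long p[1];
} Value;

/* Return a newly allocated value for a digit character, or NULL otherwise. */
Value *noun(int c)
{
    if (c < '0' || c > '9')
        return NULL;             /* not a digit: no noun here */
    Value *z = malloc(sizeof *z);
    if (z == NULL)
        return NULL;             /* allocation failed */
    z->p[0] = c - '0';           /* same effect as *z->p = c - '0' */
    return z;
}
```

The caller owns the returned value and releases it with `free`.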

chipsy, more than 10 years ago

When code is bulky, you can excuse your ego because your visibility is limited in a literal sense. But when it's dense like this you actively know, as you try and fail to skim it, that you can only usefully focus on a small segment at a time. So you blame the code instead of yourself.

ha292, more than 10 years ago

Isn't it wonderful that many Wall St. firms use k/q to write some of the most complex analytics? Of course, it is understandable. NOT.

PhasmaFelis, more than 10 years ago

Interpreter of what? "J", apparently, from context and inference, but that should have been in the post title.

What's an "interpreter fragment"? Most of the Google results for that term point to the same story.

I know this is Hacker News and everyone is supposed to know everything about everything already, but on obscure topics it's nice to take a moment and explain what you're talking about.

cplease, more than 10 years ago

Complete shit. C was a mature language in 1989. Even for banging out a quick hack, and forgiving the gets into a 99 byte stack array, this is crap by any standard.

I can compile and run pretty much any K&R code from the C Programming Language in 1978 with no problem. Finding the appropriate compiler, I can do the same with BCPL, the Language and its Compiler (1981). And the code is actually understandable.

This, on the other hand, is an obfuscated mess that segfaults on any input, J or otherwise. I'll be damned if I spend 5 minutes debugging it; ex() segfaults with infinite recursion. It's not worth the trouble.

Not saying this isn't worth sharing as a curiosity or an artifact, but this is not a work of genius and it is not defensible or to be emulated. Do your colleagues a favor and don't write like this.

Arthur Whitney's One-page Interpreter (1992)
