科技回声

6 comments

lisper, nearly 7 years ago

At the end, Kegler links to this comprehensive overview of the history of parsing:

https://jeffreykegler.github.io/personal/timeline_v3

which contains this easily overlooked but IMHO extremely significant statement:

"a recursive descent implementation can parse operator expressions as lists, and add associativity in post-processing"

Personally, it has always seemed like a no-brainer to me that this is clearly the Right Answer. It is a mystery to me that the computing world at large has spent so much effort on a problem whose solution is actually very straightforward if you just give in on one tiny little piece of theoretical purity.

See http://www.flownet.com/ron/lisp/parcil.lisp for my own implementation of such a parser. As you will see if you count LOCs, it's very, very simple by parser standards, and yet it handles all the "hard" problems: associativity, precedence, infix and prefix operators.
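
The quoted idea can be illustrated with a minimal Python sketch. It is not taken from parcil.lisp; the operator table `OPS` and the function names `tokenize`, `parse_flat`, `build_tree`, and `parse` are hypothetical, and prefix operators are left out to keep it short. The recursive descent step only collects a flat list of operands and operators, and a separate post-processing pass applies precedence and associativity.

```python
import re

# Hypothetical operator table: symbol -> (precedence, associativity).
# Assumes all operators at one precedence level share the same associativity.
OPS = {
    "+": (1, "left"), "-": (1, "left"),
    "*": (2, "left"), "/": (2, "left"),
    "^": (3, "right"),
}

def tokenize(text):
    # Integers, identifiers, operators, and parentheses; anything else is skipped.
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/^()]", text)

def parse_flat(tokens, pos=0):
    """Recursive descent step: return ([operand, op, operand, ...], next_pos)."""
    items = []
    while True:
        if pos >= len(tokens):
            raise SyntaxError("expected an operand")
        tok = tokens[pos]
        if tok == "(":
            inner, pos = parse_flat(tokens, pos + 1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("expected ')'")
            items.append(build_tree(inner))  # a parenthesised group is one operand
            pos += 1
        elif tok in OPS or tok == ")":
            raise SyntaxError("expected an operand, got %r" % tok)
        else:
            items.append(int(tok) if tok.isdigit() else tok)
            pos += 1
        if pos < len(tokens) and tokens[pos] in OPS:
            items.append(tokens[pos])
            pos += 1
        else:
            return items, pos

def build_tree(flat):
    """Post-processing: turn the flat list into nested (op, left, right) tuples."""
    if len(flat) == 1:
        return flat[0]
    op_positions = range(1, len(flat), 2)  # operators sit at odd indices
    lowest = min(OPS[flat[i]][0] for i in op_positions)
    candidates = [i for i in op_positions if OPS[flat[i]][0] == lowest]
    assoc = OPS[flat[candidates[0]]][1]
    # Left-associative: split at the last such operator; right-associative: the first.
    split = candidates[-1] if assoc == "left" else candidates[0]
    return (flat[split], build_tree(flat[:split]), build_tree(flat[split + 1:]))

def parse(text):
    tokens = tokenize(text)
    flat, pos = parse_flat(tokens)
    if pos != len(tokens):
        raise SyntaxError("unexpected token %r" % tokens[pos])
    return build_tree(flat)
```

With this table, `parse("1 - 2 - 3")` groups to the left as `("-", ("-", 1, 2), 3)`, while `parse("2 ^ 3 ^ 2")` groups to the right as `("^", 2, ("^", 3, 2))`. Splitting at the lowest-precedence operator is only one way to do the post-processing; it was chosen here for brevity rather than speed.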

joe_the_user, nearly 7 years ago

The thing is that writing a parser requires a person to understand what a formal language is. Overall, only a subset of programmers understand even this, so parsing has a certain inherent hardness to it (you can't just use a library or just use an object).

Of course, the problem of how to create a parser is solvable any number of ways if you mean how to convert an unambiguously specified formal language into a parser. But that doesn't mean basic challenges don't remain. Especially because a formal language is hard to understand (and can be ambiguous) and because one wants the language to actually do something, there is a further trickiness involved (you have to bridge the interface between syntax and semantics). So which way to solve the problem of parsing becomes a complex decision. But it's not so much "we don't know how to efficiently do this yet" but rather "there is no one-size-fits-all approach."

agumonkey, nearly 7 years ago

Superb website with loads of content.

This https://jeffreykegler.github.io/personal/timeline_v3 is also worth your time twice over.

PhantomGremlin, nearly 7 years ago

If parsing is "complicated", then there's another solution. Don't play the game. Change the rules. Play a different game.

My understanding (and, since this is the Interwebs, I will quickly be corrected if I'm wrong) is that Python is easy to parse; a lot of the battles about adding features to the language involve keeping the grammar simple.

And yet Python is eminently useful, despite being simple to parse.

I'm reminded of how we didn't understand how to specify a simple grammar in the "good old days". E.g. take ancient FORTRAN.

The for-loop in FORTRAN is actually called do. And you specify the end of the loop by a numerical statement label (found in columns 1 thru 5). Thus:

<pre><code>      DO 10 I = 1, 7
      some stuff here, loop done for I = 1,2,3,4,5,6,7
   10 final line of loop
</code></pre>

But spaces aren't significant. So if you write the following statement

<pre><code>      DO 10 I = (1, 7)
</code></pre>

you get something totally different. You set the value of the complex variable DO10I to (1,7). Bheech. Who wants to parse that? (And yet, there were very capable FORTRAN compilers back in the 1960s!)
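
To make the ambiguity concrete, here is a rough Python sketch of one way a statement classifier might tell the two cases apart. It is an illustration only, not how any particular FORTRAN compiler worked: the names `classify` and `has_top_level_comma` are hypothetical, and the sketch ignores column rules, continuation lines, and character constants. It assumes the deciding feature is a comma outside parentheses on the right of the equals sign.

```python
import re

def has_top_level_comma(text):
    """Return True if text contains a comma that is not inside parentheses."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return True
    return False

def classify(statement):
    """Classify a statement as a DO loop or an assignment (hypothetical helper)."""
    # Blanks are not significant, so drop them before looking at the text.
    s = statement.replace(" ", "").upper()
    loop = re.fullmatch(r"DO(\d+)([A-Z][A-Z0-9]*)=(.*)", s)
    if loop and has_top_level_comma(loop.group(3)):
        # DO <label> <variable> = <start>, <end>
        return ("do-loop", int(loop.group(1)), loop.group(2))
    assign = re.fullmatch(r"([A-Z][A-Z0-9]*)=(.*)", s)
    if assign:
        # No top-level comma: the whole left-hand side is one variable name.
        return ("assignment", assign.group(1))
    return ("unknown",)
```

Under these assumptions, `classify("DO 10 I = 1, 7")` reports a loop with label 10 and variable `I`, while `classify("DO 10 I = (1, 7)")` reports an assignment to `DO10I`. The point is that the statement kind cannot be decided from the leading `DO` alone; the scanner has to look ahead past the equals sign.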

CalChris, nearly 7 years ago

I’m a little surprised that ANTLR and L* don’t make the list (ANTLR from the practitioner POV and L* from theory).

lower, nearly 7 years ago

I'm sorry, but this is just rambling. He goes on about theorists and practitioners without actually saying anything about the problem at all. He doesn't explain why he thinks the current state of the art isn't the solution. What does he want to do that isn't handled well?

There are many ways in which parsing can be improved in practice and theory. Actual technical aspects would be more interesting.

Undershoot: Parsing theory in 1965
