(Article author here.)<p>I'm delighted to see this get some attention here.<p>I absolutely love these techniques for parsing, but as my primary research area is static analysis, I haven't had time to revise this paper and resubmit.<p>As it stands, I may never get the time to do so. :(<p>I posted it on arXiv so that David could reference it for his Ph.D. school apps.<p>Since some of you have asked, here are the reviews:<p><a href="http://matt.might.net/papers/reviews/esop2010-derivatives.txt" rel="nofollow">http://matt.might.net/papers/reviews/esop2010-derivatives.txt</a><p>I do have an updated implementation that's much cleaner and faster, and I've been planning to write it up in a blog post. (Alas, no opportunity yet.)<p>David's also done another implementation in Haskell that's screaming fast and efficient on many restricted classes of grammars (like LL(k)). I'll encourage him to post that as well.<p>If you're interested in getting your name on a scientific publication and helping this get the attention of the scientific community, you can help us by creating an implementation of either technique in your favorite language and beating on it to help find the inefficiencies.<p>(For instance, the original Scala version linked from this paper has memory leaks from the way it caches derivatives. We got around them by rolling top-level repetition by hand.)<p>Please email me if that's something you're interested in doing.<p>Can HN do science? I'd love to find out.
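If you want a feel for the core idea before diving into the paper, here is a minimal sketch of the Brzozowski derivative for plain regular languages in Scala (just the textbook recognizer, not the parser from the paper; the paper extends derivatives to arbitrary context-free grammars, which additionally requires laziness, memoization, and a fixed-point computation for nullability):<p><pre><code> // Regular languages as an algebraic data type.
 sealed trait Lang
 case object Empty extends Lang                 // the empty language
 case object Eps extends Lang                   // the language containing only ""
 case class Chr(c: Char) extends Lang           // a single character
 case class Alt(l: Lang, r: Lang) extends Lang  // union
 case class Cat(l: Lang, r: Lang) extends Lang  // concatenation
 case class Star(l: Lang) extends Lang          // Kleene star

 // Does the language accept the empty string?
 def nullable(l: Lang): Boolean = l match {
   case Empty     => false
   case Eps       => true
   case Chr(_)    => false
   case Alt(a, b) => nullable(a) || nullable(b)
   case Cat(a, b) => nullable(a) && nullable(b)
   case Star(_)   => true
 }

 // The derivative of l with respect to c: the suffixes w such that
 // c followed by w is in l.
 def derive(l: Lang, c: Char): Lang = l match {
   case Empty     => Empty
   case Eps       => Empty
   case Chr(d)    => if (c == d) Eps else Empty
   case Alt(a, b) => Alt(derive(a, c), derive(b, c))
   case Cat(a, b) => if (nullable(a)) Alt(Cat(derive(a, c), b), derive(b, c))
                     else Cat(derive(a, c), b)
   case Star(a)   => Cat(derive(a, c), Star(a))
 }

 // Recognition is repeated derivation followed by a nullability check, e.g.
 // matches(Star(Alt(Chr('x'), Chr('s'))), "xsxsx") == true
 def matches(l: Lang, s: String): Boolean = nullable(s.foldLeft(l)(derive))</code></pre>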
Here's what a grammar actually looks like in their Scala version. This grammar is for arithmetic over the language where x represents 1, and s represents +. This only generates the parse tree, not the final answer. Comments are added by me:<p><pre><code> // The Nodes in the parse tree
abstract class Exp
case object One extends Exp
case class Sum(e1 : Exp, e2 : Exp) extends Exp
// Terminals
lazy val S : Parser[Char,Char] = new EqT[Char] ('s')
lazy val X : Parser[Char,Char] = new EqT[Char] ('x')
// Definition of an expression
// I'm pretty sure all the asInstanceOfs are avoidable/unnecessary.
lazy val EXP : Parser[Char,Exp] =
  rule(X) ==> { case x => One.asInstanceOf[Exp] } ||
  rule(EXP ~ S ~ EXP) ==> { case e1 ~ s ~ e2 => Sum(e1,e2).asInstanceOf[Exp] } ||
  rule(EXP ~ S ~ X) ==> { case e1 ~ s ~ e2 => Sum(e1,One).asInstanceOf[Exp] } ||
  rule(X ~ S ~ EXP) ==> { case e1 ~ s ~ e2 => Sum(One,e2).asInstanceOf[Exp] } ||
  rule(X) ==> { case x => One.asInstanceOf[Exp] } ||
  rule(EXP) ==> { case e => e } ||
  rule(Epsilon[Char]) ==> { case () => One }
// Actually run the rule
val xin = Stream.fromIterator("xsxsxsxsx".elements)
EXP.parseFull(xin)
// return value => Stream(Sum(One,Sum(Sum(One,One),Sum(One,One))), ?)</code></pre>
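Since this only builds the parse tree, turning the tree into a number is a separate step. A small fold over Exp would do it (my addition, not part of their library):<p><pre><code> // Evaluate the parse tree produced above.
 def eval(e: Exp): Int = e match {
   case One         => 1
   case Sum(e1, e2) => eval(e1) + eval(e2)
 }

 // eval(Sum(One, Sum(Sum(One, One), Sum(One, One)))) == 5</code></pre>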
What is actually the state of the art in parsing?<p>When I, as an amateur, last looked into it, PEG[1] and extensions like OMeta[2] seemed to be the best options. I've heard good things about Parsec[3], too.<p>[1](<a href="http://en.wikipedia.org/wiki/Parsing_expression_grammar" rel="nofollow">http://en.wikipedia.org/wiki/Parsing_expression_grammar</a>)
[2](<a href="http://www.tinlizzie.org/ometa/" rel="nofollow">http://www.tinlizzie.org/ometa/</a>)
[3](<a href="http://legacy.cs.uu.nl/daan/parsec.html" rel="nofollow">http://legacy.cs.uu.nl/daan/parsec.html</a>)
Well, ANTLR (<a href="http://www.antlr.org/" rel="nofollow">http://www.antlr.org/</a>) has replaced yacc for most of the practical work already. Daniel Spiewak also mentions parser combinators and how GLL parsers can be much easier to use and yet fast enough most of the time (<a href="http://www.codecommit.com/blog/scala/unveiling-the-mysteries-of-gll-part-1" rel="nofollow">http://www.codecommit.com/blog/scala/unveiling-the-mysteries-of-gll-part-1</a>). He describes parser combinators as domain-specific languages for creating parsers (<a href="http://www.codecommit.com/blog/scala/the-magic-behind-parser-combinators" rel="nofollow">http://www.codecommit.com/blog/scala/the-magic-behind-parser-combinators</a>).<p>Despite all of these, adoption of better techniques is really slow. As usual.
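For a sense of the combinator DSL style, here is roughly the x/s grammar from the example above rewritten with Scala's standard-library parser combinators (a sketch of mine; these are plain recursive-descent combinators, so the left-recursive rule has to become a repetition, and you get a single left-associated tree rather than every parse of the ambiguous grammar):<p><pre><code> import scala.util.parsing.combinator.RegexParsers

 // Same parse-tree types as in the derivatives example above.
 abstract class Exp
 case object One extends Exp
 case class Sum(e1: Exp, e2: Exp) extends Exp

 object ArithParser extends RegexParsers {
   // "x" is 1; "s"-separated repetition replaces the left-recursive rule.
   def one: Parser[Exp]  = "x" ^^ { _ => One }
   def expr: Parser[Exp] =
     one ~ rep("s" ~> one) ^^ { case first ~ rest =>
       rest.foldLeft(first: Exp)(Sum(_, _))
     }
 }

 // ArithParser.parseAll(ArithParser.expr, "xsxsxsxsx") succeeds with
 // Sum(Sum(Sum(Sum(One,One),One),One),One)</code></pre>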
Fascinating. I don't understand it all yet - I'm reading it carefully, but it'll be weeks before I really get it.<p>However ...<p>I'm getting the feeling that this properly captures an intuition I've had about parsing for some time: that there should be a way of parsing the entire text at once, with the parse settling onto the text all at the same time. I don't know if this is what it's saying, but that's the sense I get from it.<p>But even if it isn't, it looks intriguing.
There are so many useful links in this discussion that I consolidated them (and a few extras) here: <a href="http://post.b3k.us/modern-parsing-bibliography" rel="nofollow">http://post.b3k.us/modern-parsing-bibliography</a>