TechEcho
The Lemon Parser Generator

86 points by begoon over 2 years ago

10 comments

dunham over 2 years ago
Another tangentially related tool that I recently learned about is "BNFC":

    Given a Labelled BNF grammar the tool produces:
    - an abstract syntax implementation in the target language,
    - a case skeleton for the abstract syntax in the target language,
    - a pretty-printer in the target language,
    - an Alex, JLex, or Flex lexer generator file,
    - a Happy, CUP, or Bison parser generator file, and
    - a LaTeX file containing a readable specification of the language.

Targeting C, Haskell, Agda, C++, Java, or OCaml.

Might be fun to expand on this to generate tree-sitter, highlight.js, or a vscode extension.

http://bnfc.digitalgrammars.com/
414owen over 2 years ago
I'm using lemon to parse a programming language.

Lemon produces this report for my grammar:

    Parser statistics:
      terminal symbols...................    19
      non-terminal symbols...............    42
      total symbols......................    61
      rules..............................    75
      states.............................    56
      conflicts..........................     0
      action table entries...............   377
      lookahead table entries............   379
      total table size (bytes)...........  1433

It generates a parser that seems to be a reasonable size:

    $ wc src/parser.c
     2200 10090 78282 src/parser.c

I've tried a few other parser generators; lemon was my favourite.

re2c for tokenizing, and lemon for parsing.
rgovostes over 2 years ago
Tangentially related, here's a tool I've wanted (someone else) to build:

There are many variations of how grammars are written, usually variants of Backus–Naur form, and often I find a grammar spec is published using a different variant than what is expected by the parser tool I want to use (e.g., a grammar-based fuzzer).

For example, Python's grammar (https://docs.python.org/3/reference/grammar.html) has custom syntax with a whole PEP (https://peps.python.org/pep-0617/) describing the grammar of the grammar.

It would be nice to have a tool that can at least help with the mechanical transformation between these grammar syntaxes.
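
To make the "mechanical transformation" idea concrete, here is a minimal sketch in C++ that rewrites one BNF-style rule into Lemon-style rules. It assumes Lemon's convention of one alternative per rule, each terminated by a period, with upper-case terminal names. The function `bnf_to_lemon` and the literal-to-token map are hypothetical names introduced only for illustration. It handles a single whitespace-separated rule with `|` alternatives and quoted literals, and nothing more (no grouping, repetition, or optional parts).

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: turn one BNF-style rule such as
//   expr ::= expr "+" term | term
// into Lemon-style rules, one per alternative, each ending in a period.
// `tokens` maps quoted literals to the upper-case terminal names to emit.
std::vector<std::string> bnf_to_lemon(
    const std::string& rule,
    const std::map<std::string, std::string>& tokens) {
    std::vector<std::string> out;
    const std::string sep = "::=";
    const std::string::size_type pos = rule.find(sep);
    if (pos == std::string::npos) return out;  // not a rule; emit nothing

    // Left-hand side: the first whitespace-delimited word before "::=".
    std::istringstream lhs_stream(rule.substr(0, pos));
    std::string lhs;
    lhs_stream >> lhs;
    if (lhs.empty()) return out;

    std::istringstream rhs(rule.substr(pos + sep.size()));
    std::string word;
    std::string current = lhs + " ::=";
    while (rhs >> word) {
        if (word == "|") {
            // End of one alternative: close the rule and start the next.
            out.push_back(current + ".");
            current = lhs + " ::=";
        } else if (word.size() >= 2 && word.front() == '"' && word.back() == '"') {
            const std::string literal = word.substr(1, word.size() - 2);
            const auto it = tokens.find(literal);
            if (it != tokens.end()) {
                current += " " + it->second;
            } else {
                // Unknown literals are kept verbatim so the gap stays visible.
                current += " " + word;
            }
        } else {
            current += " " + word;  // non-terminal: copied through unchanged
        }
    }
    out.push_back(current + ".");
    return out;
}
```

Called with the rule `expr ::= expr "+" term | term` and a map from `+` to `PLUS`, this yields the two lines `expr ::= expr PLUS term.` and `expr ::= term.`. A real converter would also need to desugar EBNF repetition and optional groups into extra rules, which is where most of the work lies.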
yuan43 over 2 years ago
> Lemon uses a different grammar syntax which is designed to reduce the number of coding errors.

Why does it seem as if almost every parser generator defines its own quirky grammar syntax? What's wrong or so difficult with just accepting W3C EBNF? Who thinks it's a good idea to force grammars to be re-written in the first place?

Does nobody complain about vendor lock-in due to the quirky grammar syntax they were forced to use?

Where are the automatic conversion utilities for these parser generators? Something that takes, say, W3C EBNF and spits out the quirky parser generator grammar language? Shouldn't that be simple?

I really don't get it.
gonzus over 2 years ago
I have used this parser generator and it works like a charm.

The only thing I would change is the way it is distributed: the links at the end point to a sort of amalgamation of the actual sources that make up the parser generator (in the same style that is used for the SQLite source amalgamation), but I think it would be more beneficial to have access to the separate files that actually make up this amalgamation (and you could hide as static variables / functions some of the implementation details this way).

Anyway, excellent tool!
j0e1 over 2 years ago
> The code comes with no warranty. If it breaks, you get to keep both pieces.

That reads much better than legalese.
zellyn over 2 years ago
Here’s my port to Go: https://github.com/gopikchr/golemon

It’s a little ways behind the canonical implementation, because I haven’t touched it for a while, but lemon changes very slowly, if at all.
dang over 2 years ago
Related:

The Lemon Parser Generator - https://news.ycombinator.com/item?id=10295087 - Sept 2015 (10 comments)

The Lemon Parser Generator - https://news.ycombinator.com/item?id=4473854 - Sept 2012 (2 comments)
ncmncm over 2 years ago
Nowadays, you code your grammar directly in the source language, and your parser library generates a parser at compile time as part of the normal build cycle.

Of course this works best in a language that supports operator overloading and compile-time operations.

In the old days, in C++, this would have been done with template metaprogramming, which came with various annoyances. Now no such workarounds are needed.
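
As a rough illustration of writing a grammar directly in the host language with operator overloading, here is a minimal C++17 sketch. It is an assumption-laden toy, not any particular library's API: the names `Parser`, `ch`, `range`, `matches`, and `sum_grammar` are made up for this example, and the parser is assembled at run time with `std::function` rather than at compile time as the comment describes.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string_view>

// A parser takes the input and a start offset and returns the offset just
// past the match, or nullopt on failure. (Hypothetical toy type.)
struct Parser {
    std::function<std::optional<std::size_t>(std::string_view, std::size_t)> run;
};

// Match one literal character.
Parser ch(char c) {
    return {[c](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        if (pos < in.size() && in[pos] == c) return pos + 1;
        return std::nullopt;
    }};
}

// Match any single character in the inclusive range [lo, hi].
Parser range(char lo, char hi) {
    return {[lo, hi](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        if (pos < in.size() && in[pos] >= lo && in[pos] <= hi) return pos + 1;
        return std::nullopt;
    }};
}

// Sequence: a >> b matches a, then b starting where a stopped.
Parser operator>>(Parser a, Parser b) {
    return {[a, b](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        if (auto mid = a.run(in, pos)) return b.run(in, *mid);
        return std::nullopt;
    }};
}

// Ordered choice: a | b tries a first, then b at the same position.
Parser operator|(Parser a, Parser b) {
    return {[a, b](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        if (auto r = a.run(in, pos)) return r;
        return b.run(in, pos);
    }};
}

// Zero or more: *a repeats a greedily and never fails.
Parser operator*(Parser a) {
    return {[a](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        std::size_t cur = pos;
        while (auto next = a.run(in, cur)) {
            if (*next == cur) break;  // stop if a matched without consuming input
            cur = *next;
        }
        return cur;
    }};
}

// One or more: +a is a followed by zero or more a.
Parser operator+(Parser a) { return a >> *a; }

// True if p consumes the whole input.
bool matches(const Parser& p, std::string_view in) {
    const auto end = p.run(in, 0);
    return end && *end == in.size();
}

// Grammar written inline:  sum ::= number ('+' number)* ;  number ::= digit+
Parser sum_grammar() {
    Parser number = +range('0', '9');
    return number >> *(ch('+') >> number);
}
```

With this sketch, `matches(sum_grammar(), "12+345+6")` is true and `matches(sum_grammar(), "12+")` is false. It only recognizes input; building values or a syntax tree, and moving the construction to compile time, are left out to keep the example short.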
samatman over 2 years ago
Quite the Baader-Meinhof effect!

This morning I added a commit to a fossil repo, and that commit was on a .y lemon file.

That's not a typical thing I do in a day.