Another tangentially related tool that I recently learned about is "BNFC":<p><pre><code> Given a Labelled BNF grammar the tool produces:
- an abstract syntax implementation in the target language,
- a case skeleton for the abstract syntax in the target language,
- a pretty-printer in the target language,
- an Alex, JLex, or Flex lexer generator file,
- a Happy, CUP, or Bison parser generator file, and
- a LaTeX file containing a readable specification of the language.
</code></pre>
Targeting Haskell, Agda, C, C++, Java, or OCaml.<p>Might be fun to expand on this to generate tree-sitter, highlight.js, or a vscode extension.<p><a href="http://bnfc.digitalgrammars.com/" rel="nofollow">http://bnfc.digitalgrammars.com/</a>
I'm using lemon to parse a programming language.<p>Lemon produces this report for my grammar:<p><pre><code> Parser statistics:
terminal symbols................... 19
non-terminal symbols............... 42
total symbols...................... 61
rules.............................. 75
states............................. 56
conflicts.......................... 0
action table entries............... 377
lookahead table entries............ 379
total table size (bytes)........... 1433
</code></pre>
It generates a parser that seems to be a reasonable size:<p><pre><code> $ wc src/parser.c
2200 10090 78282 src/parser.c
</code></pre>
I've tried a few other parser generators; lemon was my favourite.<p>re2c for tokenizing, and lemon for parsing.
Tangentially related, here's a tool I've wanted (someone else) to build:<p>There are many variations of how grammars are written, usually variants of Backus–Naur form, and often I find a grammar spec is published using a different variant than what is expected by the parser tool I want to use (e.g., a grammar-based fuzzer).<p>For example, Python's grammar (<a href="https://docs.python.org/3/reference/grammar.html" rel="nofollow">https://docs.python.org/3/reference/grammar.html</a>) has custom syntax with a whole PEP (<a href="https://peps.python.org/pep-0617/" rel="nofollow">https://peps.python.org/pep-0617/</a>) describing the grammar of the grammar.<p>It would be nice to have a tool that can at least help with the mechanical transformation between these grammar syntaxes.
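For the simplest case, the mechanical part of such a conversion can be sketched in a few lines. This is only an illustration under heavy assumptions: the input dialect is a made-up plain BNF with one rule per line, `::=` as the separator, `|` between alternatives, whitespace-separated symbols, and no quoted literals or repetition operators. The output follows Lemon's convention of one rule per alternative, each terminated by a period. The function name `bnf_to_lemon` is hypothetical and not part of any existing tool.

```python
def bnf_to_lemon(grammar: str) -> str:
    """Rewrite 'lhs ::= a b | c' lines as one Lemon-style rule per alternative.

    Assumes an illustrative BNF dialect: one rule per line, '|' only ever
    separates alternatives (no quoted literals), and no EBNF operators.
    """
    out = []
    for line in grammar.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between rules
        lhs, sep, rhs = line.partition("::=")
        if not sep:
            raise ValueError(f"expected '::=' in rule: {line!r}")
        lhs = lhs.strip()
        for alt in rhs.split("|"):
            body = " ".join(alt.split())  # normalise whitespace
            # An empty alternative becomes an empty rule.
            out.append(f"{lhs} ::= {body}." if body else f"{lhs} ::= .")
    return "\n".join(out)


if __name__ == "__main__":
    print(bnf_to_lemon("expr ::= expr PLUS term | term\nterm ::= NUM"))
```

Anything beyond this (optional and repeated elements, character classes, lookahead as in the PEG notation from PEP 617) needs real rewriting, such as introducing helper rules, which is where a proper tool would earn its keep.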
&gt; Lemon uses a different grammar syntax which is designed to reduce the number of coding errors.<p>Why does it seem as if almost every parser generator defines its own quirky grammar syntax? What's wrong or so difficult with just accepting W3C EBNF? Who thinks it's a good idea to force grammars to be re-written in the first place?<p>Does nobody complain about vendor lock-in due to the quirky grammar syntax they were forced to use?<p>Where are the automatic conversion utilities for these parser generators? Something that takes, say, W3C EBNF and spits out the quirky parser generator grammar language? Shouldn't that be simple?<p>I really don't get it.
I have used this parser generator and it works like a charm.<p>The only thing I would change is the way it is distributed: the links at the end point to a sort of amalgamation of the actual sources that make up the parser generator (in the same style that is used for the SQLite source amalgamation), but I think it would be more beneficial to have access to the separate files that actually make up this amalgamation (and you could hide as static variables / functions some of the implementation details this way).<p>Anyway, excellent tool!
Here’s my port to Go: <a href="https://github.com/gopikchr/golemon" rel="nofollow">https://github.com/gopikchr/golemon</a><p>It’s a little ways behind the canonical implementation, because I haven’t touched it for a while, but lemon changes very slowly, if at all.
Nowadays, you code your grammar directly in the source language, and your parser library generates a parser at compile time as part of the normal build cycle.<p>Of course this works best in a language that supports operator overloading and compile-time operations.<p>In the old days, in C++, this would have been done with template metaprogramming, which came with various annoyances. Now no such workarounds are needed.
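To make the operator-overloading half of that concrete, here is a minimal parser-combinator sketch. It is written in Python purely for illustration, so the parser is assembled at run time rather than at compile time, and every name in it (`Parser`, `token`, `sum_expr`) is hypothetical rather than taken from any particular library. `+` means "followed by" and `|` means "or else try".

```python
import re


class Parser:
    """Wraps a function (text, pos) -> (value, new_pos), or None on failure."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, text, pos=0):
        return self.fn(text, pos)

    def __add__(self, other):
        # Sequence: run self, then other from where self stopped.
        def seq(text, pos):
            first = self(text, pos)
            if first is None:
                return None
            v1, pos1 = first
            second = other(text, pos1)
            if second is None:
                return None
            v2, pos2 = second
            return (v1, v2), pos2
        return Parser(seq)

    def __or__(self, other):
        # Ordered choice: try self, fall back to other at the same position.
        def alt(text, pos):
            result = self(text, pos)
            return result if result is not None else other(text, pos)
        return Parser(alt)

    def map(self, f):
        # Transform the parsed value, leaving the position untouched.
        def mapped(text, pos):
            result = self(text, pos)
            if result is None:
                return None
            value, new_pos = result
            return f(value), new_pos
        return Parser(mapped)


def token(pattern):
    """Match a regex after skipping leading whitespace."""
    rx = re.compile(r"\s*(" + pattern + ")")

    def match(text, pos):
        m = rx.match(text, pos)
        return (m.group(1), m.end()) if m else None
    return Parser(match)


number = token(r"\d+").map(int)
plus = token(r"\+")

# Grammar written as ordinary expressions:  sum ::= number '+' number | number
# The sequence nests to the left, so the value is ((n, '+'), m).
sum_expr = (number + plus + number).map(lambda v: v[0][0] + v[1]) | number

if __name__ == "__main__":
    print(sum_expr("1 + 2"))
    print(sum_expr("7"))
```

Calling a parser returns the parsed value together with the position where parsing stopped, or `None` if nothing matched. A language with compile-time evaluation can do the same composition during the build, which is the part this sketch does not show.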
Quite the Baader-Meinhof effect!<p>This morning I added a commit to a fossil repo, and that commit was on a .y lemon file.<p>That's not a typical thing I do in a day.