I used Lex and Yacc (as the article mentions, the direct predecessors of the Flex and Bison it covers) a little, a long time ago. More recently, when I've written parsers I've just used recursive descent, though in those cases I was parsing a well-defined grammar. Using Yacc did leave me knowing how to describe grammars in it, which I have found useful.<p>Guy Steele said something about using Yacc for language <i>design</i> (rather than implementation) that stuck with me:<p><i>Be sure that your language will parse. It seems stupid to sit down and start designing constructs and not worry how they will fit together. You can get a language that's difficult if not impossible to parse, not only for a computer, but for a person. I use YACC constantly as a check of all my language designs, but I very seldom use YACC in the implementation. I use it as a tester, to be sure that it's LR(1) ... because if a language is LR(1) it's more likely that a person can deal with it.</i><p>From the Dynamic Languages Wizards series (in 2001), in the panel on language design (1:09:05) [1]<p>I've not yet employed Yacc in this fashion, but it did give me a tool for thinking about object models. A while ago I was puzzling over how some classes in an entity relationship diagram should be related, and I considered the question from the point of view of how I would design a grammar for serializing an instance of the model into text. This essentially made my decision for me in a principled way, though I never got as far as writing up a grammar for the whole model; I only considered the implications for the local bit that was troubling me.<p>[1]
<a href="https://youtu.be/agw-wlHGi0E?si=n-ann0TYjvZ45ie5&t=4145" rel="nofollow noreferrer">https://youtu.be/agw-wlHGi0E?si=n-ann0TYjvZ45ie5&t=4145</a><p>edit: added a few clarifying notes
OK, these are the basics. But do not stop reading here if you want to write a parser: there are more modern tools to look at (e.g., ANTLR).<p>Warning 1: parsing Unicode streams well is awkward with flex, which comes from an age when ASCII ruled. Handling multiple input encodings can get weird. If the input is only UTF-8, it may work, because to flex that is essentially just bytes. Still, I find a hand-written scanner more convenient (the grammar is seldom too complex for that). And regexps based on General_Category, ID_Start, etc.? Difficult...<p>Warning 2: for various reasons (usually flexibility, conflict resolution, error reporting, and/or error recovery), many projects move from bison to something else, even a handwritten recursive descent parser. That takes more code, but it is not that difficult.
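To make the hand-written scanner point concrete, here is a minimal sketch in Python. It assumes the input has already been decoded into a `str` (so the encoding question is dealt with before scanning), and it only approximates ID_Start and ID_Continue through General_Category values; the real Unicode properties also involve exception lists such as Other_ID_Start, which this ignores. The names `scan`, `is_id_start` and `is_id_continue` are made up for the example.

```python
import unicodedata

# Approximation only: the real ID_Start / ID_Continue properties are
# derived from these categories plus a few exception lists.
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
CONTINUE_CATEGORIES = START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}


def is_id_start(ch):
    # Allowing "_" as a start character is a choice made for this sketch.
    return ch == "_" or unicodedata.category(ch) in START_CATEGORIES


def is_id_continue(ch):
    return unicodedata.category(ch) in CONTINUE_CATEGORIES


def scan(text):
    """Yield (kind, lexeme) pairs from an already-decoded string."""
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif is_id_start(ch):
            j = i + 1
            while j < len(text) and is_id_continue(text[j]):
                j += 1
            yield ("IDENT", text[i:j])
            i = j
        elif unicodedata.category(ch) == "Nd":
            j = i + 1
            while j < len(text) and unicodedata.category(text[j]) == "Nd":
                j += 1
            yield ("NUMBER", text[i:j])
            i = j
        else:
            # Everything else is treated as one-character punctuation.
            yield ("PUNCT", ch)
            i += 1
```

Something like `list(scan("na\u00efve = \u5909\u6570_1 + 42"))` should then yield identifiers for the accented and CJK names without any byte-level trickery. A production scanner would probably also want normalization and source positions, which are left out here.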
If lex/yacc style parsing works for you, then great. However, I suspect most people are going to get more mileage out of just hand-writing a recursive descent parser and moving on with their lives.<p>The benefit of recursive descent parsers is that they're easy to write, modify, and understand. You don't need any new paradigms; you just write code like you typically do. If something goes wrong, your standard debugging skills will serve you well.<p>There are also a lot of other relatively easy parsing technologies out there. For example, you can also consider monadic parsing, parser combinators, or PEG libraries.<p>I spent a year trying to figure out which parsing technique worked best for me, and I'm glad I didn't just stick with my starting point of lex/yacc. So again, if this guide allows parsing to just work for you, then great, stick with it. But if you find yourself running into a lot of problems, it might be worth looking around, because other options exist and work just fine.
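For anyone who hasn't seen one, a recursive descent parser is roughly one function per grammar rule. The sketch below is in Python and uses a toy arithmetic grammar I made up for illustration (it is not from the article); it evaluates as it parses to keep things short, where a real parser would more likely build a tree.

```python
import re

# Hypothetical toy grammar, chosen only for this sketch:
#   expr   := term (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')' | '-' factor
TOKEN = re.compile(r"\d+(?:\.\d+)?|[-+*/()]")


def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        tokens.append(match.group())
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.next() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.next() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        token = self.next()
        if token == "(":
            value = self.expr()
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token == "-":
            return -self.factor()
        if token[0].isdigit():
            return float(token)
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

Operator precedence falls out of which function calls which, and left associativity comes from the `while` loops. When something goes wrong, you can step through it with an ordinary debugger.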
After taking a compiler course in uni, I found the emphasis on dealing with syntax mostly a waste of time. To begin with, do yourself a favor and use S-expression syntax (like Lisp) for your language. S-expressions are dead simple to parse. With the syntax out of the way, you can get to the meat and potatoes of implementing a language. Later on you can always define a "look" for your language, and you can spend an inordinate amount of time on that.
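As a rough illustration of how little code that takes, here is a minimal S-expression reader sketched in Python. It assumes atoms are either integers or bare symbols (no strings, comments, or quoting), and the function names are invented for the example.

```python
def sexp_tokenize(source):
    # Pad parentheses with spaces so a plain split() separates every token.
    return source.replace("(", " ( ").replace(")", " ) ").split()


def sexp_atom(token):
    # Integers become int; anything else is kept as a symbol (a plain str).
    try:
        return int(token)
    except ValueError:
        return token


def sexp_read(tokens):
    """Consume one expression from the front of the token list."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(sexp_read(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # discard the closing ')'
        return items
    if token == ")":
        raise SyntaxError("unexpected ')'")
    return sexp_atom(token)


def sexp_parse(source):
    tokens = sexp_tokenize(source)
    tree = sexp_read(tokens)
    if tokens:
        raise SyntaxError("unexpected input after expression")
    return tree
```

With this, `sexp_parse("(define (square x) (* x x))")` gives nested Python lists that an interpreter or compiler can walk directly, which is the point: the parse tree is the syntax.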
Kind of related, for anyone curious about parsing and JS: I have to recommend peggy for writing simple parsers for files to be consumed by JavaScript. It's pretty niche, but it does the job well. I've developed a few packages using it so far.