It's funny how projects grow organically like this.<p>Perhaps burned by experience, I once implemented a mini-language for specifying some business logic, because I knew, just <i>knew</i>, that our client would change his mind a dozen times, would not understand half the ramifications of his requests, and would only arrive at the solution he <i>really</i> wanted by trial and error. Was the little language I made as complex as a "real" compiler? Goodness no. Was I happy to have a flexible language-based solution? True to my predictions, I <i>did</i> get many, many logic change requests and handled them with ease. Yes, I was very happy after that.
Given how easy it is to build a Turing machine, I'd argue that building something you didn't envision is the rule 99% of the time, and it's already captured by <a href="https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule" rel="nofollow noreferrer">https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule</a>: "Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp."<p>In fact I'd argue that it's much harder to avoid accidentally building something you didn't envision.
A funny thing is that this can go N layers deep. At a previous job we had a DSL which was very well specified, pretty straightforward, and had been written by an exceptionally brilliant engineer (who was even still at the company!). We had two backend services that would operate on that AST, one in a query context and one in a streaming transformation context, and they shared the parser, compiler, and typechecker. The problem was that the engineering team was so fragmented and lacking in stable technical leadership that a large part of it had no idea how these things worked, so on every project they were constantly going "Oops, I created a feature which has its own intermediate representation with slightly different semantics, so reuse is impossible". It was super hard to deal with, and stuff was constantly being invented and then thrown out because it couldn't be extended.<p>The only thing worse than creating a compiler is being unaware that one already exists and creating a new one on top of the existing one.
At a billion-dollar fintech, some engineer thought gRPC was too complicated.
So, he built something from scratch.<p>A year later, there was a team of 5 engineers maintaining a half-baked implementation of gRPC.
Good for him.
Bad for the company.
Related:<p><i>Dear sir, you have built a compiler</i> - <a href="https://news.ycombinator.com/item?id=29891428">https://news.ycombinator.com/item?id=29891428</a> - Jan 2022 (175 comments)
What's strange is that if you had started with a Lisp, it would have been much simpler!<p>And yet few people think about using a Lisp for their DSL.
It is sort of incredible how often I've ended up accidentally re-inventing interpreters or compilers without that being an explicit goal.<p>You start by just adding some kind of configuration for rules, maybe in JSON. Then you start wanting to make more complex rules, so you allow some kind of recursive system in your JSON that can nest rules and combine them. Then you find yourself copy-pasting rules a lot, so you implement some kind of naming convention so you can reuse rules. Then you realize how disgusting your JSON is getting, so you dust off a parsing library and make a basic DSL that compiles into that JSON, and then it dawns on you.
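To make that progression concrete, here is a minimal sketch of roughly where such a JSON rule system tends to land: nested rules, combinators, and named rules that can be referenced. Everything here is invented for illustration (the operator names "all", "any", "not", "ref", "gt", "eq" and the example fields), so treat it as one possible shape and not anyone's actual system.

```python
import json

# Hypothetical rule format; every operator name and field is made up.
RULES_JSON = """
{
  "is_adult": {"gt": ["age", 17]},
  "is_local": {"eq": ["country", "PL"]},
  "eligible": {"all": [{"ref": "is_adult"}, {"not": {"ref": "is_local"}}]}
}
"""

def evaluate(rule, record, named_rules):
    """Recursively evaluate one rule node against a record (a dict)."""
    (op, arg), = rule.items()  # each node carries exactly one operator key
    if op == "gt":
        field, value = arg
        return record[field] > value
    if op == "eq":
        field, value = arg
        return record[field] == value
    if op == "not":
        return not evaluate(arg, record, named_rules)
    if op == "all":
        return all(evaluate(r, record, named_rules) for r in arg)
    if op == "any":
        return any(evaluate(r, record, named_rules) for r in arg)
    if op == "ref":  # the "naming convention" step: look up a reusable rule
        return evaluate(named_rules[arg], record, named_rules)
    raise ValueError(f"unknown operator: {op}")

rules = json.loads(RULES_JSON)
print(evaluate(rules["eligible"], {"age": 30, "country": "US"}, rules))
```

Once `evaluate` exists you already have a tree-walking interpreter; the JSON is just its syntax, which is why writing a nicer front end for it feels like the obvious next step.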
Creating a configuration file? I am afraid to inform you that you have started writing a compiler. What's the only way to avoid this? Your software not being successful.
There was a joke at Uber about beginning with a configuration management system and ending up with a version control system.<p>And somewhere else about accidentally building a real time chat service (or was it email? Don’t recall).
The key is to know, acknowledge, and accept where you are going, and to go boldly and deliberately, or not go at all. Say "this problem looks like making a small program so I'm making a mini-language". However, if you find yourself saying "this looks like a database..." then stop there and please do not build a db.
I accidentally built a kind-of compiler last year.<p>It started as a few sed commands to merge TeX+code -> TeX for a book project. I ran these sed commands from a makefile. Life was easy.<p>But then there were complications, and I needed to make slightly more sophisticated substitutions. So the sed commands moved into an awk script, run by the makefile. This was better than maintaining a handful of little commands that were growing on a weekly basis. Life was good.<p>The transformations I needed kept growing a bunch of little variations, and the awk script became hard to maintain, so I rewrote it in Go, with proper parsing and output. (And even unit tests, after the 2nd time I broke some output.) Designing it as almost-a-proper-compiler was 10x better than maintaining an ad hoc script. Life was great, even with the overhead of maintaining a separate processing tool.
Totally offtopic but I can't help but wonder why this guy has a site with .pl TLD. He seems to be based in the US, not Poland. Does he think "pl" stands for "programming languages"? :)
I am especially interested in the characterisation of handling interactions between different AST nodes.<p>I think interactions between features are very hard to think about.<p>I think constructed languages have the opportunity to think about potential interactions that would be useful and aim to support those.<p>But there are lots of permutations of features.<p>Just look at async functions in Rust and coloured functions. It's such a pain.<p>It also reminds me of "the expression problem" [0]<p>How do you work out upfront which combinations of features would be useful? For example: there are interactions between memory management, garbage collection, async, multithreading, coroutines, closures, the stack, FFI. It's all very complicated.<p>[0]: <a href="https://en.wikipedia.org/wiki/Expression_problem" rel="nofollow noreferrer">https://en.wikipedia.org/wiki/Expression_problem</a>
Other needed topics in this series: “You have built a database”, “You have built an orchestrator”, “You have built an RPC layer”, “You have built a build system”…
I work on a Python project where I need to take class definitions and generate database query statements because all ORMs that currently exist don’t work for my needs. I'm currently doing this with string templates that I've defined by hand. Is there a smarter way?<p>I've looked into some compiler-like tools (can't remember the specific ones, sorry), and from what I can tell their code generation phase looks very similar to mine in that they use string templates.
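One possible alternative to hand-written string templates, sketched under assumptions: read the class definitions with `dataclasses.fields`, build small query objects from them, and only turn those into text at the very end. The `SQL_TYPES` mapping, the `Column`/`CreateTable`/`Insert` classes, and the `?` placeholder style (as used by `sqlite3`) are all hypothetical choices for illustration, and this assumes the classes are dataclasses with plain type annotations.

```python
from dataclasses import dataclass, fields

# Hypothetical mapping from Python annotations to SQL column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL", bool: "BOOLEAN"}

@dataclass
class Column:
    name: str
    sql_type: str

@dataclass
class CreateTable:
    table: str
    columns: list

    def render(self):
        cols = ", ".join(f"{c.name} {c.sql_type}" for c in self.columns)
        return f"CREATE TABLE {self.table} ({cols})"

@dataclass
class Insert:
    table: str
    columns: list

    def render(self):
        names = ", ".join(c.name for c in self.columns)
        marks = ", ".join("?" for _ in self.columns)  # values are bound later
        return f"INSERT INTO {self.table} ({names}) VALUES ({marks})"

def columns_of(cls):
    """Derive column nodes from a dataclass definition."""
    return [Column(f.name, SQL_TYPES[f.type]) for f in fields(cls)]

@dataclass
class User:  # example class definition to generate statements for
    id: int
    name: str
    score: float

cols = columns_of(User)
print(CreateTable("users", cols).render())
print(Insert("users", cols).render())
```

The rendering step is still string formatting, much like the templates described above. The difference is that the statement exists as data first, so it can be checked or rewritten before any text is produced.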
... how does this even happen? What bizarre use case doesn't allow for just using an off the shelf scripting language?<p>I've only done anything like this once and not regretted it, and it's purely visual scripting, if this then that style. Anything that can't be handled by an event that triggers a list of actions, then stops when one returns False, I will hardcode a hack just for that feature.<p>Most actions and triggers are responses to specific use cases.
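For readers who want to picture the "event triggers a list of actions, stops when one returns False" model, here is a rough sketch. The `EventBus` class, its method names, and the door/alarm example are all hypothetical, not the commenter's actual system. One choice made here: only an explicit `False` stops the chain, so actions that return nothing keep it going.

```python
class EventBus:
    """Hypothetical 'if this then that' dispatcher; all names are illustrative."""

    def __init__(self):
        self._actions = {}  # event name -> ordered list of callables

    def on(self, event, action):
        self._actions.setdefault(event, []).append(action)

    def fire(self, event, context):
        """Run actions in order; return False if one of them stopped the chain."""
        for action in self._actions.get(event, []):
            if action(context) is False:
                return False
        return True


log = []

def is_armed(ctx):
    # Acts as a guard: returning False stops the remaining actions.
    return ctx["armed"]

def sound_alarm(ctx):
    log.append(f"alarm: {ctx['door']}")

bus = EventBus()
bus.on("door_opened", is_armed)
bus.on("door_opened", sound_alarm)

bus.fire("door_opened", {"armed": False, "door": "front"})  # stops at is_armed
bus.fire("door_opened", {"armed": True, "door": "back"})    # both actions run
print(log)
```

There is no parser and no expression language here, which is probably why this style stays manageable: the only control flow is "continue or stop".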
A compiler typically transforms programs that run in O(N) time into programs that run in O(N) time, for whatever suitable definition of N, so it's not really something to be super thrilled about.