>I'm calling it Tilde (or TB for tilde backend) and the reasons are pretty simple, i believe it's far too slow at compiling and far too big to be fixed from the inside. It's been 20 years and cruft has built up, time for a "redo".<p>That put a smile on my face, because I remember that is how LLVM was born out of frustration with GCC.<p>I don't know how modern GCC and LLVM compare. I remember LLVM was fast but the resulting binaries were not as optimised; once those optimisations were added, it became slower. Meanwhile LLVM was a wake-up call to modernise GCC and make it faster. In the end competition made <i>both</i> a lot better.<p>I believe some industries (gaming) used to swear by Visual Studio / the MS compiler / the Intel compiler, or by languages that depend on / prefer the Borland compiler (whatever they are called now). It's been a very long time since I last looked, so I wonder whether those are still in use or whether we have mostly all converged on LLVM / GCC?
Chris Lattner seems to have also created an alternative to LLVM - <a href="https://mlir.llvm.org/" rel="nofollow">https://mlir.llvm.org/</a><p>Because of how the architecture works, LLVM is just one of the possible backends; it doesn't have to be used at all. Very interesting project: you could do a lot more IR processing before descending to LLVM (if you use it), and that way you could give LLVM a lot less to do.<p>Chris has said LLVM is fast at what it is designed to do - lower IR to machine code. However, because of how convoluted the process can get, and the difficulty of carrying information from some language-specific MIR down to LLVM, languages are forced to generate tons upon tons of IR so as to capture every possible detail. Then LLVM is asked to clean up and optimize this IR.<p>One thing to look out for is the problem of either losing language-specific information when moving from MIR to low-level IR (be it Tilde or LLVM) or generating too much information, most of it useless.
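To make the "tons of IR" problem concrete, here is a rough sketch written as equivalent C (purely illustrative; not any real front-end's output). A bounds-checked source language already proved the index is in range for the whole loop, but that invariant is lost in the lowering, so every access repeats the check and the backend has to rediscover and delete it.<p><pre><code>  #include &lt;stdlib.h&gt;

  /* What a bounds-checked language might be forced to emit after lowering,
     shown as equivalent C. The front-end's MIR knew 0 &lt;= i &lt; len for the
     whole loop, but the low-level IR cannot express that, so each access
     carries a check that the optimizer is later asked to remove. */
  long sum(const long *a, long len) {
      long s = 0;
      for (long i = 0; i &lt; len; i++) {
          if (i &lt; 0 || i &gt;= len)  /* redundant: guaranteed by the loop bounds */
              abort();
          s += a[i];
      }
      return s;
  }
</code></pre>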
I saw Yasser present this at Handmade Seattle in 2023.[0] He explained that when he started working on Tilde, he didn't have any special knowledge or interest in compilers. But he was reading discussions in the Handmade forums, and one of the most popular requests was for an alternative to LLVM, so he thought, "Sure, I'll do that."<p>[0] <a href="https://handmadecities.com/media/seattle-2023/tb/" rel="nofollow">https://handmadecities.com/media/seattle-2023/tb/</a>
Cool. The author has set himself a huge task if he wants to build something like LLVM. An alternative would be to participate in a project with similar goals that is already well along, such as QBE or Eigen (<a href="https://github.com/EigenCompilerSuite/">https://github.com/EigenCompilerSuite/</a>); both so far lack optimizers. I consider Eigen very attractive because it supports many more targets and includes assemblers and linkers for all of them. I see the advantage in having a C implementation; Eigen is unfortunately developed in C++17, but I managed to backport the parts I'm using to a moderate C++11 subset (<a href="https://github.com/rochus-keller/Eigen">https://github.com/rochus-keller/Eigen</a>). There are different front-ends available, two C compilers among them. And - as mentioned - an optimizer would be great.<p>EDIT: just found this podcast where the author gives more information about the project goals and history (at least the beginning of the podcast is interesting): <a href="https://www.youtube.com/watch?v=f2khyLEc-Hw" rel="nofollow">https://www.youtube.com/watch?v=f2khyLEc-Hw</a>
Looking at the commit history inspires some real confidence!<p><a href="https://github.com/RealNeGate/Cuik/commits/master/">https://github.com/RealNeGate/Cuik/commits/master/</a>
I thought the sea-of-nodes choice was interesting.<p>V8 has been moving away from sea-of-nodes. Here's a video where Ben Titzer talks about V8's reasons for moving away from it: <a href="https://www.youtube.com/watch?v=Vu372dnk2Ak&t=184s" rel="nofollow">https://www.youtube.com/watch?v=Vu372dnk2Ak&t=184s</a>. Yasser, the author of Tilde, is also in the video.
> a decent linear scan allocator which will eventually be replaced with graph coloring for optimized builds.<p>Before setting out to implement 1980s-style graph coloring, I would suggest considering SSA-based register allocation instead: <a href="https://compilers.cs.uni-saarland.de/projects/ssara/" rel="nofollow">https://compilers.cs.uni-saarland.de/projects/ssara/</a>; I find the slides at <a href="https://compilers.cs.uni-saarland.de/projects/ssara/hack_ssara_ssa09.pdf" rel="nofollow">https://compilers.cs.uni-saarland.de/projects/ssara/hack_ssa...</a> especially useful.<p>Graph coloring is a nice model for the register <i>assignment</i> problem. But that's a relatively easy part of overall register allocation. If your coloring fails, you need to decide what to spill and how. Graph coloring does not help you with this; you will end up having to iterate coloring and spilling until convergence, and you may spill too much as a result.<p>But if your program is in SSA, the special properties of SSA can be used to properly separate these subphases: do a single spilling pass first (still not easy!) and then do a coloring that is guaranteed to succeed.<p>I haven't looked at LLVM in a while, but 10-15 years ago it used to transform out of SSA form just before register allocation. If I had to guess, I would guess it still does so. <i>Not</i> destroying SSA too early could actually be a significant differentiator from LLVM's "cruft".
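For contrast, here is a deliberately simplified sketch of the color-then-spill loop described above (hypothetical data structures, my own code, not full Chaitin/Briggs and not anything from Tilde or LLVM). The interesting part is the failure path: when no color is free, plain graph coloring has to pick a spill candidate and start over, which is the iterate-until-convergence problem that SSA-based allocation sidesteps by spilling once up front and then coloring a chordal interference graph.<p><pre><code>  #include &lt;stdbool.h&gt;
  #include &lt;string.h&gt;

  #define K 4            /* number of physical registers (made-up value) */
  #define MAX_VREGS 64

  /* interferes[a][b] is true when virtual registers a and b are live at once */
  static bool interferes[MAX_VREGS][MAX_VREGS];
  static int color_of[MAX_VREGS];   /* -1 = uncolored */

  /* Returns a vreg that must be spilled, or -1 if every vreg got a color. */
  int greedy_color(int nvregs) {
      memset(color_of, -1, sizeof color_of);
      for (int v = 0; v &lt; nvregs; v++) {
          bool used[K] = {0};
          for (int u = 0; u &lt; nvregs; u++)
              if (interferes[v][u] &amp;&amp; color_of[u] &gt;= 0)
                  used[color_of[u]] = true;
          int c = 0;
          while (c &lt; K &amp;&amp; used[c]) c++;
          if (c == K)
              return v;   /* coloring failed: spill v, rebuild, try again */
          color_of[v] = c;
      }
      return -1;
  }
</code></pre>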
> I believe it's (LLVM) far too slow at compiling and far too big to be fixed from the inside<p>What are you doing to make sure Tilde does not end up like this?
<p><pre><code> > It's been 20 years and cruft has built up, time for a "redo".
</code></pre>
Ah.. is this one of those "I rewrote it and it's better" things, but when people inevitably discover issues that "cruft" was handling the author will blame the user?
Tsoding explored this project on a recent stream: <a href="https://youtu.be/aKk_r9ZwXQw?si=dvZAZkOX3xd7yjTw" rel="nofollow">https://youtu.be/aKk_r9ZwXQw?si=dvZAZkOX3xd7yjTw</a>
If you're going to rewrite LLVM, you should avoid just trying to 'do it again but less bloated', because that'll end up where LLVM is now once you've added enough features and optimisations to be competitive.<p>Rewriting LLVM gives you the opportunity to rethink some of its main problems. Of those I think two big ones are TableGen and peephole optimisations.<p>The backend code for LLVM is awful, and TableGen only partially addresses the problem. Most LLVM code for defining instruction opcodes amounts to multiple huge switch statements that stuff every opcode into them; it's disgusting. This code is begging for a more elegant solution, and I think a functional approach would solve a lot of the problems.<p>The peephole optimisation in the InstCombine pass is a huge collection of handwritten rules that has accumulated over time. You probably don't want to try and redo this yourself, but it will also be a big barrier to achieving competitive optimisation. You could try to solve the problem by using a superoptimisation approach from the beginning. Look into the Souper paper, which automatically generates peepholes for LLVM: (<a href="https://github.com/google/souper">https://github.com/google/souper</a>, <a href="https://arxiv.org/pdf/1711.04422.pdf" rel="nofollow">https://arxiv.org/pdf/1711.04422.pdf</a>).<p>Lastly, as I hate C++, I have to throw in an obligatory suggestion to rewrite it in Rust :p
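For anyone unfamiliar with what one of those handwritten InstCombine-style rules looks like, here is a tiny sketch on a made-up IR node type (not LLVM's or Tilde's representation). Passes like InstCombine are thousands of such pattern matches accumulated by hand; Souper's pitch is to synthesize them automatically instead.<p><pre><code>  #include &lt;stdint.h&gt;
  #include &lt;stdbool.h&gt;

  typedef enum { OP_CONST, OP_MUL, OP_SHL } Op;

  typedef struct Node {
      Op op;
      int64_t imm;            /* value when op == OP_CONST */
      struct Node *lhs, *rhs;
  } Node;

  static bool is_pow2(int64_t v) { return v &gt; 0 &amp;&amp; (v &amp; (v - 1)) == 0; }
  static int  log2_i64(int64_t v) { int n = 0; while (v &gt;&gt;= 1) n++; return n; }

  /* Peephole: (mul x, C) -&gt; (shl x, log2(C)) when C is a power of two.
     Assumes constants were already canonicalized to the right-hand operand. */
  void combine_mul(Node *n) {
      if (n-&gt;op != OP_MUL) return;
      Node *c = n-&gt;rhs;
      if (c-&gt;op == OP_CONST &amp;&amp; is_pow2(c-&gt;imm)) {
          n-&gt;op = OP_SHL;
          c-&gt;imm = log2_i64(c-&gt;imm);
      }
  }
</code></pre>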
I'm not familiar with a lot of the acronyms and catch-phrases already in the first part of the article... let me try to make a bit of sense of this:<p><pre><code> IR = Intermediate Representation
SSA = Static Single Assignment
CFG = Control-Flow Graph (not Context-Free Grammar)
</code></pre>
And "sea of nodes" is this: <a href="https://en.wikipedia.org/wiki/Sea_of_nodes" rel="nofollow">https://en.wikipedia.org/wiki/Sea_of_nodes</a> ... IIANM, that means that instead of assuming a global sequence of all program (SSA) instructions, which respects the dependecies - you only have a graph with the partial order defined by the dependencies, i.e. individual instructions are nodes that "float" in the sea.
I appreciate several things about this compiler already:<p>MIT license (the king of licenses, IMHO. That's not an objective statement though)<p>Written in easy-to-understand C.<p>No Python required to build it. (GCC requires Perl, and I think Perl is way easier to bootstrap than the Python that LLVM needs.)<p>No Apple. I don't know if you all have seen some of the Apple developers talking with people, but some of them are extremely condescending and demeaning towards people from non-formal CS backgrounds. I get it: in a world where you are one of the supreme developers it's easy to be that way, but it's also just as bad as having people like Theo or historical Torvalds.
I’m definitely happy to see this happening. But I would like to point out two ingredients of LLVM’s success beyond its academic merits: license and modularity. I’m not a lawyer so I can’t say much about the first one; all I can say is that I believe licensing is one of the main reasons Apple switched to LLVM decades ago. Modularity, on the other hand, is one of the most crucial features of LLVM and something GCC struggles to catch up on even nowadays. I really hope Tilde adopts the same modularity philosophy and provides building blocks rather than just tools.
Shameless plug for another similar system: <a href="http://cwerg.org" rel="nofollow">http://cwerg.org</a>
Less ambitious with a focus on (measurable) simplicity.
This looks pretty cool. I've been looking at all the "small" backends recently. It's so much nicer to work with one of them than trying to wrangle LLVM.<p>QBE, MIR, & IR (php's) are all worth a look too.<p>Personally I've settled on IR for now because it seemed to match my needs the most closely. It's actively developed, has aarch64 in addition to x64 (looks like TB has just started that?), does x64 Windows ABI, and seems to generate decent code quickly.
Again, somebody who has come to the realization that something is seriously wrong with ultra-complex languages in the SDK (C++ and similar).<p>In other words, since this LLVM alternative is coded in plain and simple C, it is shielded against those who still do not see that computer languages with an ultra-complex syntax are not the right way to go if you want sane software.<p>You also have QBE, which with cproc will give you ~70% of the speed of the latest GCC (in my benchmarks on AMD Zen 2 x86_64).
I dunno if "twice as fast as Clang" is very impressive. How fast is it compared to Clang 1.0?<p>Also starting a new project like this in C is an interesting choice.
I don't understand much about IR or compiler backends, but I know of another LLVM alternative, QBE: <a href="https://c9x.me/compile/" rel="nofollow">https://c9x.me/compile/</a>
I'm confused, is this some kind of re-post? I saw this exact same post with the exact same comments on it a while ago. Could have been more than a couple of weeks. Very strange.
Good stuff. Hope they succeed. LLVM support for JITed and GCed languages is pretty weak, so a competitor that addresses those and other shortcomings would be welcome.
The maintainer said that LLVM has 10M lines of code, making it too hard to improve, so he's building his own.
That sounds weird to me, but good luck I guess?
You want to create an LLVM alternative and you write it in C?<p>I'm saying this as someone who uses LLVM daily and wishes it were written in anything other than C/C++;
those languages bring so many cons that it is unreal.
Slow compilation, mediocre tooling (CMake), terrible error messages, etc, etc.
What's the point of starting with tech debt?