Interesting, first time I've heard about sljit.<p>> Although sljit does not support higher level features such as automatic register allocation<p>I don't quite see how it can be architecture-independent if it doesn't do register allocation. Does it use a small, fixed set of virtual registers that works on every target? Or does it spill virtual registers to memory when required?<p>> The key design principle of sljit is that it does not try to be smarter than the developer.<p>> This principle is achieved by providing control over the generated machine code like assembly languages.<p>So it sounds like this essentially plays the role of an LLVM backend, taking care of going from an intermediate representation to machine code.<p>Optimisations have to be done separately.<p>I can see how a lightweight code generator could be quite useful; is sljit used in any larger projects?
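For what it's worth, "no automatic register allocation" usually means the caller names concrete registers itself, the way sljit's SLJIT_R0, SLJIT_R1, ... constants suggest. Here's a toy model of that style of API in plain C (everything below is made up for illustration, none of it is sljit's actual interface):<p><pre><code> #include &lt;assert.h&gt;

 /* Toy "assembly-like" LIR with no register allocator: the
  * caller picks concrete registers, and the backend's only
  * job is to encode/execute the operations. */
 enum { OP_MOVI, OP_ADD };

 typedef struct { int op, dst, a, b; } insn;
 typedef struct { insn code[32]; int len; } program;

 static void emit(program *p, int op, int dst, int a, int b) {
     p-&gt;code[p-&gt;len++] = (insn){op, dst, a, b};
 }

 /* Trivial "backend": run the LIR over a fixed register file. */
 static long run(const program *p) {
     long r[4] = {0};
     for (int i = 0; i &lt; p-&gt;len; i++) {
         const insn *in = &amp;p-&gt;code[i];
         if (in-&gt;op == OP_MOVI) r[in-&gt;dst] = in-&gt;a;
         else                   r[in-&gt;dst] = r[in-&gt;a] + r[in-&gt;b];
     }
     return r[0];  /* R0 holds the result by convention */
 }

 int main(void) {
     program p = {0};
     emit(&amp;p, OP_MOVI, 1, 40, 0);  /* R1 = 40 -- caller picked R1 */
     emit(&amp;p, OP_MOVI, 2,  2, 0);  /* R2 = 2                      */
     emit(&amp;p, OP_ADD,  0,  1, 2);  /* R0 = R1 + R2                */
     assert(run(&amp;p) == 42);
     return 0;
 }
</code></pre>
Because the register choice is fixed at emit time, the library can stay architecture-independent by simply mapping each numbered register to a concrete machine register per target.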
From the website:<p><pre><code> The engine strikes a good balance between performance and
maintainability. The LIR code can be compiled to many CPU
architectures, and the performance of the generated code
is very close to code written in assembly languages.
Although sljit does not support higher level features
such as automatic register allocation, it can be a code
generator backend for other JIT compiler libraries.
Developing these intermediate libraries takes far
less time, because they only need to support a single
backend.
</code></pre>
<a href="https://zherczeg.github.io/sljit/" rel="nofollow">https://zherczeg.github.io/sljit/</a><p>I'd love to see some examples of other projects incorporating this library.
This is interesting:<p><a href="https://github.com/zherczeg/sljit/blob/master/sljit_src/sljitLir.h">https://github.com/zherczeg/sljit/blob/master/sljit_src/slji...</a><p>(Lines 167-205):<p><pre><code> /* Scratch registers. */
 #define SLJIT_R0 1
 #define SLJIT_R1 2
 #define SLJIT_R2 3
 /* Note: on x86-32, R3 - R6 (same as S3 - S6) are emulated
    (they are allocated on the stack). These registers are
    called virtual and cannot be used for memory addressing
    (cannot be part of any SLJIT_MEM1, SLJIT_MEM2 construct).
    There is no such limitation on other CPUs.
    See sljit_get_register_index(). */
 #define SLJIT_R3 4
 [...]
 #define SLJIT_R9 10
 [...]
 /* Saved registers. */
 #define SLJIT_S0 (SLJIT_NUMBER_OF_REGISTERS)
 #define SLJIT_S1 (SLJIT_NUMBER_OF_REGISTERS - 1)
 #define SLJIT_S2 (SLJIT_NUMBER_OF_REGISTERS - 2)
 /* Note: on x86-32, S3 - S6 (same as R3 - R6) are emulated
    (they are allocated on the stack). These registers are
    called virtual and cannot be used for memory addressing
    (cannot be part of any SLJIT_MEM1, SLJIT_MEM2 construct).
    There is no such limitation on other CPUs.
    See sljit_get_register_index(). */
 #define SLJIT_S3 (SLJIT_NUMBER_OF_REGISTERS - 3)
 [...]
 #define SLJIT_S9 (SLJIT_NUMBER_OF_REGISTERS - 9)
</code></pre>
Anyway, this is a cool technique: emulating additional registers via the stack on machines that don't have them, so that the assembly-like LIR can run on those machines too!<p>Very nice!
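The trick the header describes can be modeled in a few lines of plain C. This is just an illustration of the idea, not sljit's actual code: the first few "registers" stand in for hardware registers, and the rest are slots in a stack-allocated spill area.<p><pre><code> #include &lt;assert.h&gt;
 #include &lt;stddef.h&gt;

 /* Toy model: a target with only 3 "real" registers but an LIR
  * that promises 6. Registers 0-2 map to hardware; registers 3-5
  * are "virtual" and live in stack slots, so LIR code can still
  * use them (just not for memory addressing, as the sljit
  * comment notes). */
 #define HW_REGS  3
 #define LIR_REGS 6

 typedef struct {
     long hw[HW_REGS];               /* stands in for machine registers */
     long spill[LIR_REGS - HW_REGS]; /* stack slots for virtual regs    */
 } reg_file;

 /* Return the storage backing LIR register r. */
 static long *reg_slot(reg_file *f, int r) {
     assert(r &gt;= 0 &amp;&amp; r &lt; LIR_REGS);
     return r &lt; HW_REGS ? &amp;f-&gt;hw[r] : &amp;f-&gt;spill[r - HW_REGS];
 }

 int main(void) {
     reg_file f = {0};
     *reg_slot(&amp;f, 1) = 10;  /* hardware register                     */
     *reg_slot(&amp;f, 4) = 32;  /* virtual register, really on the stack */
     assert(*reg_slot(&amp;f, 1) + *reg_slot(&amp;f, 4) == 42);
     return 0;
 }
</code></pre>
The real code generator does the equivalent at the machine-code level, emitting loads/stores to the frame wherever a virtual register is named.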
Although I only looked at the code briefly, I suspect it will be very hard to get good performance from the API as provided[1].<p>It looks like you have to do a function call for every high-level assembly instruction, and each call in turn does quite a bit of work; see `emit_x86_instruction`[2]. Most of that work is redundant and could be done ahead of time. To JIT quickly you want to work with templates if at all possible: precompile those templates for the relevant architecture, then at runtime just patch the precompiled machine code with the correct addresses and registers. This extra speed really matters, because if JITing code is cheap you can compile functions multiple times, inline aggressively, and do many other optimizations that wouldn't be economical otherwise.<p>[1] <a href="https://github.com/zherczeg/sljit/blob/master/test_src/sljitTestCall.h">https://github.com/zherczeg/sljit/blob/master/test_src/slji...</a><p>[2]: <a href="https://github.com/zherczeg/sljit/blob/master/sljit_src/sljitNativeX86_64.c">https://github.com/zherczeg/sljit/blob/master/sljit_src/slji...</a>
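A minimal sketch of the template idea (the bytes are an x86-64 `mov eax, imm32; ret` sequence; nothing is executed here, the point is just that "compiling" is a memcpy plus a patch, assuming a little-endian host):<p><pre><code> #include &lt;assert.h&gt;
 #include &lt;stdint.h&gt;
 #include &lt;string.h&gt;

 /* Precompiled template: x86-64 "mov eax, imm32; ret".
  * The 4 bytes after the 0xB8 opcode are a placeholder that
  * gets patched at JIT time with the desired constant. */
 static const unsigned char tmpl[] = {
     0xB8, 0x00, 0x00, 0x00, 0x00,  /* mov eax, &lt;imm32&gt; */
     0xC3                           /* ret              */
 };
 #define IMM_OFFSET 1

 /* "JIT" a function returning v by copying the template and
  * patching the immediate -- no per-instruction encoding work. */
 static void emit_return_const(unsigned char *buf, uint32_t v) {
     memcpy(buf, tmpl, sizeof tmpl);
     memcpy(buf + IMM_OFFSET, &amp;v, sizeof v);
 }

 int main(void) {
     unsigned char code[sizeof tmpl];
     emit_return_const(code, 42);
     assert(code[0] == 0xB8 &amp;&amp; code[5] == 0xC3);
     uint32_t imm;
     memcpy(&amp;imm, code + IMM_OFFSET, sizeof imm);
     assert(imm == 42);
     /* To actually run it you'd copy it into an executable
      * mapping (mmap + mprotect) and cast to a function pointer. */
     return 0;
 }
</code></pre>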
Since this is stackless, does it support checkpointing and resuming the execution state? This is sometimes a reason for making execution stackless, but I guess the JIT might make this more difficult. I can't find any mention of this in the readme or project page so I'm guessing no, but it would be neat.
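To illustrate why stacklessness helps with checkpointing (independent of whether sljit supports it): when all live state sits in a plain struct rather than on the C stack, a checkpoint is just a copy of the struct, and resuming is calling the step function on the copy. A toy example:<p><pre><code> #include &lt;assert.h&gt;
 #include &lt;string.h&gt;

 typedef struct {
     int pc;   /* which statement to run next */
     int i;    /* loop counter                */
     int acc;  /* running total               */
 } task;

 /* Sum 1..5, one loop iteration per call; returns 1 while running. */
 static int step(task *t) {
     switch (t-&gt;pc) {
     case 0:
         t-&gt;i = 1; t-&gt;acc = 0; t-&gt;pc = 1;
         /* fall through */
     case 1:
         if (t-&gt;i &lt;= 5) { t-&gt;acc += t-&gt;i++; return 1; }
         t-&gt;pc = 2;
         /* fall through */
     default:
         return 0;  /* done */
     }
 }

 int main(void) {
     task t = {0};
     step(&amp;t); step(&amp;t);           /* run part way: acc = 1 + 2 */
     task snap;
     memcpy(&amp;snap, &amp;t, sizeof t);  /* checkpoint: one memcpy    */
     while (step(&amp;t)) {}           /* finish the original       */
     assert(t.acc == 15);
     while (step(&amp;snap)) {}        /* resume the checkpoint     */
     assert(snap.acc == 15);
     return 0;
 }
</code></pre>
A JIT complicates this mainly because the saved state may contain code addresses that aren't stable across runs, so serialization would need some extra relocation step.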
Interesting project. On a somewhat tangential topic: will JIT compilers still be widely adopted, given that they are considered a critical attack surface when they misbehave? I wonder whether there is an effort to formally verify their safety, or to redesign them from the ground up to ensure it.
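One widely deployed mitigation (short of full verification) is W^X: a code page is writable or executable, but never both at once. A POSIX-only sketch of the pattern; the bytes are never executed here, only the protection flip is shown:<p><pre><code> #include &lt;assert.h&gt;
 #include &lt;string.h&gt;
 #include &lt;sys/mman.h&gt;
 #include &lt;unistd.h&gt;

 int main(void) {
     long pagesz = sysconf(_SC_PAGESIZE);

     /* 1. Map the buffer read/write -- not executable yet. */
     unsigned char *code = mmap(NULL, pagesz, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
     assert(code != MAP_FAILED);

     /* 2. Emit machine code while the page is still writable. */
     memset(code, 0x90, 16);  /* placeholder bytes (x86 NOPs) */

     /* 3. Flip to read/execute; the page is no longer writable,
      *    so a bug can't scribble over live code. */
     int rc = mprotect(code, pagesz, PROT_READ | PROT_EXEC);
     assert(rc == 0);

     munmap(code, pagesz);
     return 0;
 }
</code></pre>
This doesn't make the JIT correct, of course; it just narrows the window in which a miscompile or memory bug can be turned into arbitrary code execution.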