This could be pretty cool. Right now you have to drop down to non-portable, assembly-level JIT libraries like Xbyak (<a href="https://github.com/herumi/xbyak" rel="nofollow">https://github.com/herumi/xbyak</a>) or VIXL on ARM (<a href="https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/announcing-vixl-a-dynamic-code-generation-toolkit-for-armv8" rel="nofollow">https://community.arm.com/developer/ip-products/processors/b...</a>). It might not be obvious why you'd want something like that, so I'll explain. Basically, think of this as templates that you don't have to pre-instantiate. You can specify exactly how long your loops are, so the compiler can do a much better job of vectorizing, eliminating branches, and so on. In tight math-kernel-style code this could easily improve performance by 2x or more, and combined with intrinsics it could eliminate the need for hand-written assembly in many cases.<p>This would be exciting if it came to fruition at some point. Unfortunately, the C++ standardization process being what it is, we're probably looking at 2025 or later before we see the first standard-compliant implementation. Perhaps other languages could take the idea and run with it.
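To make the "templates you don't have to pre-instantiate" point concrete, here's a rough sketch of what you're stuck with today (the kernel and dispatch names are made up for illustration): you bake in a fixed menu of trip counts at compile time and dispatch at runtime, and anything not on the menu falls back to a generic loop. A JIT'd instantiation would let the runtime value of n itself become the template argument.

    #include <cstddef>

    // N is a compile-time constant, so the compiler can fully unroll
    // and vectorize the loop for each instantiation.
    template <std::size_t N>
    void scale(float* x, float a) {
        for (std::size_t i = 0; i < N; ++i)
            x[i] *= a;
    }

    // Every size we might need has to be instantiated ahead of time;
    // sizes not in the list fall back to a generic (slower) loop.
    void scale_dispatch(float* x, std::size_t n, float a) {
        switch (n) {
            case 4:  scale<4>(x, a);  break;
            case 8:  scale<8>(x, a);  break;
            case 16: scale<16>(x, a); break;
            default:
                for (std::size_t i = 0; i < n; ++i) x[i] *= a;
        }
    }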
There is cling: <a href="https://root.cern.ch/cling" rel="nofollow">https://root.cern.ch/cling</a><p>You can also use cling as a Jupyter C++ kernel.
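For anyone who hasn't tried it, an interactive cling session looks roughly like this (prompt and output formatting from memory, may differ between versions):

    [cling]$ #include <vector>
    [cling]$ std::vector<int> v{1, 2, 3};
    [cling]$ v.size()
    (unsigned long) 3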
As in the example, the templates are in the compute-intensive part of the app: pull that out into a library, compile it once, and you're done. There's no need for JIT in the standard or in the runtime (did they mention the +75 MB(!) runtime?).
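A minimal sketch of that "compile once" approach using explicit template instantiation (file and kernel names are hypothetical):

    // kernel.h
    template <int N>
    void kernel(float* x);

    // kernel.cpp -- compiled once into the library
    template <int N>
    void kernel(float* x) {
        for (int i = 0; i < N; ++i)
            x[i] += 1.0f;
    }

    // Explicitly instantiate the sizes the app actually uses;
    // no JIT, no extra runtime shipped with the binary.
    template void kernel<128>(float*);
    template void kernel<256>(float*);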