Another major caveat to this benchmark is that it doesn't include any significant marshalling costs. For example, passing strings or arrays from Java to C is much, much slower than passing a single integer. The same is going to be true for a lot of (all?) GC'd languages, and especially true for strings when the language isn't natively UTF-8 (as in, even though Java can store strings compactly as Latin-1 internally, it doesn't expose that publicly, so JNI doesn't benefit).
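To make the marshalling point concrete, here is a rough sketch in Python rather than Java, since it runs without a build step. It assumes a POSIX system where the C library can be loaded through `ctypes`; the helper names `call_with_int` and `call_with_str` are made up for illustration and are not part of the benchmark. It compares a call that passes one integer with a call that first has to encode a string to UTF-8.

```python
import ctypes
import ctypes.util
import timeit

# Assumption: a POSIX system. If find_library returns None, CDLL(None)
# falls back to the symbols already loaded into the current process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def call_with_int(n):
    # Hypothetical helper: a single integer is converted to a C int
    # and passed by value, with no buffer to build.
    return libc.abs(n)


def call_with_str(s):
    # Hypothetical helper: a Python str has to be encoded first, which
    # allocates a new bytes object on every call. ctypes then passes a
    # pointer to that NUL-terminated buffer.
    return libc.strlen(s.encode("utf-8"))


if __name__ == "__main__":
    text = "héllo wörld " * 100
    calls = 100_000
    t_int = timeit.timeit(lambda: call_with_int(-5), number=calls)
    t_str = timeit.timeit(lambda: call_with_str(text), number=calls)
    print(f"int argument: {t_int:.3f} s for {calls} calls")
    print(f"str argument: {t_str:.3f} s for {calls} calls")
```

The absolute numbers will vary by machine and say nothing about JNI specifically. The only point is that the string path does extra allocation and copying per call, which the integer path does not.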
Some of the results look outdated. The Dart results look bad (25x slower than C), but looking at the code (<a href="https://github.com/dyu/ffi-overhead/tree/master/dart" rel="nofollow">https://github.com/dyu/ffi-overhead/tree/master/dart</a>) it appears to be five years old. Dart has a new FFI as of Dart 2.5 (2019): <a href="https://medium.com/dartlang/announcing-dart-2-5-super-charged-development-328822024970" rel="nofollow">https://medium.com/dartlang/announcing-dart-2-5-super-charge...</a> I'm curious how the new FFI would fare in these benchmarks.
There is no Python benchmark, but you can find a PR claiming it takes 123,198 ms. That would be the worst result by a wide margin.<p><a href="https://github.com/dyu/ffi-overhead/pull/18" rel="nofollow">https://github.com/dyu/ffi-overhead/pull/18</a>
The D programming language has literally zero overhead when interfacing with C. The same calling conventions are used, and the types are the same.<p>D can also access C code by simply importing a .c file:<p><pre><code> import foo; // call functions from foo.c
</code></pre>
analogously to how you can `#include "foo.h"` in C++.
I had to run it to believe it. I can confirm it's 183 seconds(!) for python3 on my laptop.<p>Also, OCaml because I was interested (milliseconds):<p><pre><code> ocaml(int,noalloc,native) = 2022
ocaml(int,alloc,native) = 2344
ocaml(int,untagged,native) = 1912
ocaml(int32,noalloc,native) = 1049
ocaml(int32,alloc,native) = 1556
ocaml(int32,boxed,native) = 7544</code></pre>
It seems Rust has basically no overhead versus C, but it could have <i>negative</i> overhead if you use cross-language LTO. Of course, you can do LTO between C files too, so that would be unfair. But I think this sets it apart from languages that, even with a highly optimised FFI, don't have compiler support for LTO with C code.
Just a caveat, not sure if it matters in practice, but this benchmark is using very old versions of many of the languages it's comparing (five-year-old ones).
I developed Deodar, a terminal emulator, file manager and text editor, 8 years ago in JavaScript/V8 with native C++ calls. It worked, but I was extremely disappointed by the speed. It felt so slow, like having to go through passport control each time you call a C++ function.
This is a cool concept, but the implementation is contrived (as many others describe). For example, JNI array marshalling/unmarshalling has a lot of overhead. The Nim version is super outdated too (not sure about the other languages).
For a game scripting language, Wren posts a pretty bad result here. I don't think it's explicitly game-focused, though. The version tested is quite old, however, having been released in 2016.