But why? Unless you need low-level map/reduce, just ditch Spark and use <a href="https://github.com/apache/datafusion-ballista">https://github.com/apache/datafusion-ballista</a> directly. It supports Python too.
I've been keeping an eye on these kinds of Spark accelerator libraries for a while now.<p>How does it compare to Blaze[1] and Gluten[2]?<p>I'm interested in benchmarking all three against my project soon to see how they compare.<p>[1] <a href="https://github.com/kwai/blaze">https://github.com/kwai/blaze</a><p>[2] <a href="https://github.com/apache/incubator-gluten">https://github.com/apache/incubator-gluten</a>