Hello HN,<p>I’m excited to share Bodo, an open-source compute engine for large-scale data processing in native Python. Bodo is powered by an auto-parallelizing JIT compiler and an HPC backend, which let it generate highly optimized, MPI-based parallel binaries from Pandas and NumPy code, all without requiring any code rewrites.<p>Our latest benchmark shows a 20x to 240x speedup over traditional distributed computing frameworks like Spark, Ray, and Dask (code and details are in the repo).<p>The inspiration for Bodo came from my background in HPC, where I saw how slow and hard to use Spark was (it has gotten better over the years, but it’s still not great). Of course, a compiler has its own limitations (e.g. not all Python is compilable), but I think the compiler-based approach is leaps and bounds better.<p>Let me know what you think.