Wonder why this is getting posted today in particular?<p>The quick summary here is that Heron was a clean-house rewrite of Apache Storm done by an internal team at Twitter. As an open source project history refresher, Apache Storm was originally built by a startup called BackType, and the project was led by Nathan Marz, the technical founder of BackType. Then BackType was acquired by Twitter, and Storm became a major component for large-scale stream processing (of tweets, tweet analytics, and other things) at Twitter.<p>I wrote a summary of the "interesting bits" of Apache Storm here:<p><a href="https://blog.parse.ly/storm/" rel="nofollow">https://blog.parse.ly/storm/</a><p>However, at a certain point, Nathan Marz left Twitter, and a different group of engineers tried to rethink Storm inside Twitter. There was also a lot of work going on around Apache Mesos at the time. Heron is kind of a merger of their "rethinking" of Storm and an effort to make it possible to manage Storm-like Heron clusters using Mesos.<p>But I don't think Heron really took off. Meanwhile, Storm got very, very stable in the 1.x series, and then had a clean-house rewrite from Clojure to Java in the 2.x series, mainly to improve performance even more. The last stable/major Storm release was in 2020.<p>Storm provides a stream processing programming API, a multi-lang wire protocol, and a cluster management approach. But certain cluster computing problems can probably be better solved at the infrastructure layer today. (For example, Storm was developed before the whole container + Docker + k8s focus in cloud ops.)
That said, it's still a very powerful system. On my team, we process 75K+ events per second across hundreds of vCPU cores and thousands of Python processes with sub-second latencies by combining Storm and Kafka with our open source Python project, streamparse.<p><a href="https://github.com/Parsely/streamparse" rel="nofollow">https://github.com/Parsely/streamparse</a><p>The core problems Storm solves: modeling data processing as a computation graph; high-speed network communication between threads, processes, and nodes; message delivery guarantees and retry capabilities; tunable parallelism; built-in monitoring and logging; and much more.<p>(Also, I'd be remiss if I didn't mention: if you're interested in stream processing and distributed computing, we are hiring Python Data Engineers to work on a stack involving Storm, Spark, Kafka, Cassandra, etc.) <a href="https://www.parse.ly/careers/python_data_engineer" rel="nofollow">https://www.parse.ly/careers/python_data_engineer</a>