It's great to remind everyone how Node follows various rules of the Unix Philosophy, and how it is designed to make process spawning and streaming as natural as it is on the OS.

I would prefer it, though, if the implication weren't that a failure in Node's design is responsible for the failure of this in-memory technique for joining massive data sets. From the article:

"However, as more and more districts began relying on Clever, it quickly became apparent that in-memory joins were a huge bottleneck."

Indeed...

"Plus, Node.js processes tend to conk out when they reach their 1.7 GB memory limit, a threshold we were starting to get uncomfortably close to."

Maybe simply "processes" rather than "Node processes"? Running out of memory when you load everything at once is not a Node-only problem.

"Once some of the country’s largest districts started using Clever, we realized that loading all of a district’s data into memory at once simply wouldn’t scale."

I think this was predictable. Earlier in the article I noticed this line:

"We implemented the join logic we needed using a simple in-memory hash join, avoiding premature optimization."

The "premature optimization" line is becoming something of a trope. It is not bad engineering to think at least as far ahead as your business model. It sounds like reaching 1/6 of your market led to a system failure. This could (should?) have been anticipated.
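
For what it's worth, the in-memory hash join pattern the article describes generally looks something like this (just a sketch; the field names and data shapes are hypothetical, not Clever's actual schema or code). The build-side Map holds one whole input at once, which is exactly why memory grows with district size:

    // Rough sketch of an in-memory hash join (hypothetical field names).
    function hashJoin(students, enrollments) {
      // Build phase: index one side entirely in memory.
      // Heap usage is O(number of students) no matter how the output
      // is consumed, so it grows with district size.
      const byId = new Map();
      for (const s of students) {
        byId.set(s.id, s);
      }

      // Probe phase: look up the matching student for each enrollment.
      const joined = [];
      for (const e of enrollments) {
        const student = byId.get(e.studentId);
        if (student) joined.push({ ...e, student });
      }
      return joined;
    }

The usual ways out are to push the join into the database or to stream sorted inputs through a merge join, either of which keeps memory roughly constant instead of proportional to the district.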
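
And on the 1.7 GB figure: as far as I know that corresponds to V8's default heap ceiling at the time (it can be raised with --max-old-space-size, though that only postpones the problem). You can watch how close a process is getting with standard Node APIs:

    // Check how close a Node process is to its V8 heap limit.
    const v8 = require('v8');

    const { heapUsed } = process.memoryUsage();
    const { heap_size_limit } = v8.getHeapStatistics();

    console.log('heap used:  ' + Math.round(heapUsed / 1048576) + ' MB');
    console.log('heap limit: ' + Math.round(heap_size_limit / 1048576) + ' MB');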