Key point: terabyte-sized <i>memory pools</i>, which is quite awesome.<p>I'd initially thought it referred to terabyte-sized <i>executables</i>, followed by an "oh great, now someone's going to <i>make</i> one, and some government's going to <i>want</i> one."
Sometimes I feel as though the source of my small, simple apps will end up being a terabyte after packaging all their dependencies. It's a little ridiculous.<p>Regardless, this seems to have some pretty powerful implications for big-data processing. The potential of integrating this with Clojure somehow, and parallelizing the computation across those 10 (at least) servers with 1TB of memory, is pretty astonishing to think about. (Though you don't need Clojure to make it parallel, yes.)<p><a href="http://www.terracotta.org/ehcache-2.2?src=/index.html" rel="nofollow">http://www.terracotta.org/ehcache-2.2?src=/index.html</a> for the product itself.
<i>An organization could put their entire database into memory, which would reduce the latency of the application by "a couple of orders of magnitude," he said.</i><p>That works well until the power goes out (and it does) or the OS (or the JVM) crashes. Keeping the hot portion of the data cached in memory (and maintaining a smarter cache than simple LRU heuristics) <i>without</i> sacrificing durability is still a must for data you care about.<p>You can checkpoint your data to disk and assume you'll never have more data than fits in memory, but that starts to become very expensive when you factor in obsolete versions, replication (to make your system immune to machine failures), and logs for recovery.<p>Ultimately there's a lot to be said about the redundancy of putting a cache in front of a database. The right thing to do, however, is to build storage systems (that may or may not resemble conventional databases) that integrate caching. I highly suggest reading about LSM trees as used by BigTable (a way to reduce write latency without significantly sacrificing durability) as well as the BigTable paper (for the "keep the hot set in memory, maintain disk persistence" model). ehCache is a useful product, but it's simplistic to say it can replace databases and file systems.
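To make the "keep the hot set in memory, maintain disk persistence" idea concrete, here is a rough sketch of an LSM-style write path in Python. It is a toy built on assumptions of mine, not how BigTable or ehCache actually work: the class name `TinyLSM`, the `memtable_limit` threshold, and the JSON file layout are all hypothetical. It also skips compaction, torn log records, and concurrency. The point is only the ordering: append to a log and fsync first, then update memory, then occasionally flush a sorted segment to disk.

```python
import json
import os
import tempfile


class TinyLSM:
    """Toy LSM-style store: write-ahead log + in-memory memtable + sorted segments.

    Hypothetical illustration only; no compaction, no torn-write handling.
    """

    def __init__(self, directory, memtable_limit=4):
        self.directory = directory
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}                    # recent writes, kept in memory
        self.segments = []                    # flushed segment paths, oldest first
        self.wal_path = os.path.join(directory, "wal.log")
        self._recover()
        self.wal = open(self.wal_path, "a", encoding="utf-8")

    def _recover(self):
        # Find segments flushed before the restart, then replay the log
        # to rebuild whatever was still only in memory.
        names = sorted(n for n in os.listdir(self.directory)
                       if n.startswith("segment-"))
        self.segments = [os.path.join(self.directory, n) for n in names]
        if os.path.exists(self.wal_path):
            with open(self.wal_path, encoding="utf-8") as f:
                for line in f:
                    key, value = json.loads(line)
                    self.memtable[key] = value

    def put(self, key, value):
        # Durability first: the write hits the log (and the disk) before
        # it is visible in memory.
        self.wal.write(json.dumps([key, value]) + "\n")
        self.wal.flush()
        os.fsync(self.wal.fileno())
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Write the memtable as one immutable segment sorted by key,
        # then start a fresh log.
        name = f"segment-{len(self.segments):06d}.json"
        path = os.path.join(self.directory, name)
        with open(path, "w", encoding="utf-8") as f:
            json.dump(sorted(self.memtable.items()), f)
            f.flush()
            os.fsync(f.fileno())
        self.segments.append(path)
        self.memtable.clear()
        self.wal.close()
        self.wal = open(self.wal_path, "w", encoding="utf-8")

    def get(self, key):
        # Newest data wins: memtable first, then segments newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for path in reversed(self.segments):
            with open(path, encoding="utf-8") as f:
                for k, v in json.load(f):
                    if k == key:
                        return v
        return None

    def close(self):
        self.wal.close()


def demo():
    with tempfile.TemporaryDirectory() as d:
        db = TinyLSM(d, memtable_limit=2)
        db.put("a", 1)
        db.put("b", 2)   # second entry reaches the limit and triggers a flush
        db.put("a", 3)   # newer value lives only in the memtable and the log
        db.close()       # stand-in for a crash or restart
        reopened = TinyLSM(d, memtable_limit=2)
        result = (reopened.get("a"), reopened.get("b"), reopened.get("missing"))
        reopened.close()
        return result
```

Calling `demo()` reopens the store after the "restart" and still finds the latest value for `"a"` (replayed from the log) and the flushed value for `"b"` (read from the segment). A real system would add compaction to merge segments and drop obsolete versions, which is exactly where the cost mentioned above shows up.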
What Azul is doing gets you about three-quarters of the way there with true SMP. They've got their Vega custom hardware, a new x86-64 software-only version named Zing (<a href="http://www.azulsystems.com/products/zing" rel="nofollow">http://www.azulsystems.com/products/zing</a>) and they're pushing an open source version of the foundation (or more) of Zing through the Managed Runtime Initiative (<a href="http://news.ycombinator.com/item?id=1491653" rel="nofollow">http://news.ycombinator.com/item?id=1491653</a> and <a href="http://www.managedruntime.org/" rel="nofollow">http://www.managedruntime.org/</a>).<p>And they're listening to people and continuing to work on the latter, e.g. 3 days ago they updated the Linux source code releases. The update includes a complete SRPM, particularly for Fedora Core 12, and a kernel patch suitable for auditing and applying to the newer 2.6.34 kernel. The patch contains the memory management half, with the remaining scheduling part to follow: <a href="http://lists.managedruntime.org/pipermail/dev/2010-July/000004.html" rel="nofollow">http://lists.managedruntime.org/pipermail/dev/2010-July/0000...</a>.