I still maintain that the existence of in-memory databases has two main sources: scalability bottlenecks in GC, and storage latency falling behind network latency and staying there.<p>If general-purpose programming languages could store data efficiently in main memory, the feature set of in-memory databases is not so large that you couldn’t roll your own incrementally. But your GC pause times would go nuts, and you’d go off the rails.<p>If the speed of light governed data access, you’d keep your data local and let the operating system decide which hot paths to keep in memory versus storage (a sketch of that idea below).<p>The last time the network was faster than disk was the 1980s, and we got things like process migration systems (Sprite). Those evaporated once the pendulum swung back.
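<p>A minimal sketch of that "let the OS decide" idea, in C. To be loud about the assumptions: "table.dat" is a made-up placeholder file, and this illustrates the comment's suggestion (OS-managed paging over a mapped file), not Umbra's actual design:

    /* Map a data file read-only and let the kernel's page cache
       decide which pages stay resident. "table.dat" is hypothetical. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("table.dat", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        uint8_t *data = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (data == MAP_FAILED) { perror("mmap"); return 1; }

        /* Hint random access so the kernel doesn't waste I/O on readahead. */
        madvise(data, st.st_size, MADV_RANDOM);

        /* Touch one byte per page: hot pages stay cached in RAM, cold
           ones cost a page fault plus a storage read. No GC involved,
           because nothing lives on a managed heap. */
        uint64_t sum = 0;
        for (off_t i = 0; i < st.st_size; i += 4096)
            sum += data[i];
        printf("checksum: %llu\n", (unsigned long long)sum);

        munmap(data, st.st_size);
        close(fd);
        return 0;
    }

(Umbra itself goes the other way: the paper builds its own buffer manager with variable-size pages rather than relying on OS paging.)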
> we can achieve comparable performance to an in-memory database system for the cached working set<p>Should I keep reading, or is the title misleading?<p>The abstract seems to say that the system provides memory-comparable performance for data that is... in-memory
Umbra in ClickBench: <a href="https://github.com/ClickHouse/ClickBench/pull/161">https://github.com/ClickHouse/ClickBench/pull/161</a><p>The initial submission didn't reproduce successfully due to a segmentation fault when restarting the system after data loading. But after some changes, it started to work and showed exceptionally good results.
This is a database system, in case you're checking the comments to figure out what kind of system this is. The paper appeared at the <i>10th Annual Conference on Innovative Data Systems Research</i> (CIDR), where that context is clear.
You can see additional papers from the same group at <a href="https://umbra-db.com/#publications" rel="nofollow">https://umbra-db.com/#publications</a>
Obligatory link to Neumann’s presentation for the CMU DB lecture series<p><a href="https://m.youtube.com/watch?v=pS2_AJNIxzU" rel="nofollow">https://m.youtube.com/watch?v=pS2_AJNIxzU</a>