Lock-free B+ trees seem like a natural and good idea.
However, it's hard to evaluate how it compares to previous work, in part because this paper uses completely different terminology than any paper I've ever read on lock-free or wait-free data structures. For starters, it uses 'latch' and 'latch-free' probably a hundred times, in lieu of the ubiquitous'lock' and 'lock-free' . I gather from Google that this is an Oracle thing[1]; they call spinlocks 'latches' and more complicated queueing locks 'enqueues'.<p>It would also be good to know more about the skip list implementation they compared against; their description in VI.A doesn't sound like any concurrent skip list I'm aware of (e.g., Java's java.util.concurrent.ConcurrentSkipListMap). They don't say what all their BW-tree implementation includes, but if it's just the data structure, 10k lines of C++ is an order of magnitude larger than even pretty complex concurrent skip lists.<p>[1] <a href="http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:10899066672589" rel="nofollow">http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTI...</a>
More context:
"Adhering to the “latch-free” philosophy, the Bw-tree delivered far better processor-cache performance than previous efforts.<p>“We had an ‘aha’ moment,” Lomet recalls, “when we realized that a single table that maps page identifiers to page locations would enable both latch-free page updating and log-structured page storage on flash memory. The other highlight, of course, was when we got back performance results that were stunningly good.”<p>The Bw-tree team first demonstrated its work in March 2011 during TechFest 2011, Microsoft Research’s annual showcase of cutting-edge projects. The Bw-tree performance results were dazzling enough to catch the interest of the SQL Server product group.<p>“When they learned about our performance numbers, that was when the Hekaton folks started paying serious attention to us,” researcher Justin Levandoski says. “We ran side-by-side tests of the Bw-tree against another latch-free technology they were using, which was based on ‘skiplists.’ The Bw-tree was faster by several factors. Shortly after that, we began engaging with the Hekaton team, mainly Diaconu and Zwilling.”<p>“A skiplist is often the first choice for implementing an ordered index, either latch-free or latch-based, because it is perceived to be more lightweight than a full B-tree implementation”, says Sudipta Sengupta, senior researcher in the Communication and Storage Systems Group. “An important contribution of our work is in dispelling this myth and establishing that latch-free B-trees can perform way better than latch-free skiplists. The Bw-tree also performs significantly better than a well-known latch-based B-tree implementation—BerkeleyDB—that is widely regarded in the community for its good performance.”<p>[1] <a href="http://research.microsoft.com/en-us/news/features/hekaton-122012.aspx" rel="nofollow">http://research.microsoft.com/en-us/news/features/hekaton-12...</a>
I'm glad that Microsoft Research publishes their studies for free like this instead of having to pony up for it through the IEEE -- this certainly looks intriguing!
I hope this makes its way into ReFS or some other Windows filesystem. A friend who used to work on the NTFS team told me ReFS was B-tree based, which disappointed me as B-Trees are ill suited to SSDs.<p>It was almost like MS completely missed the technology shift due to their glacial release cycles. But maybe I was wrong.
Does anyone have any idea about how this compares to Google's btree?<p><a href="https://code.google.com/p/cpp-btree/" rel="nofollow">https://code.google.com/p/cpp-btree/</a>