I started this project many years ago when I was coming up with ideas for immutable data structures for the PHP data structures extension [1]. I wanted to support access by position in a sorted set, which led to the idea of using binary search trees for both lists and sets. However, I did not expect the scope of this project to increase as much as it did.<p>The more I read about binary search trees, the more I thought about them, and so down the rabbit hole I went.
When you read everything you can find about a topic for several years, eventually you develop a deep enough understanding to compose new ideas from different sources.<p>This project is the result of many rejected ideas and countless experiments.
I tried my best to distill everything I consider important enough to keep, and will continue to develop other ideas as they come along.
I had no intention to implement anything other than weight-balanced strategies to support positional access, but when I read about rank-balanced trees I knew I had to take on the challenge to implement them.<p>I've been in contact with various authors along the way, specifically Bob Tarjan and Salvador Roura, but have otherwise not received any feedback yet.
Implementing all the algorithms was incredibly hard work, by far the hardest work I've ever done, so I hope that others may find value in their presentation.<p>There is still so much work that could be done, but there comes a time when working on something alone begins to yield diminishing returns. I hope to continue this project as an open-source collaborative effort, so please feel free to ask questions or suggest changes, however small.<p>[1] <a href="https://github.com/php-ds/ext-ds">https://github.com/php-ds/ext-ds</a>
This looks like a thorough treatment of the subject. But my intuition is that B-trees are better than BSTs in almost every context. A good B-tree implementation is more cache-friendly, so faster to insert, delete, traverse. In my previous job I used to replace STL map uses with a btree implementation. It was always faster.
No one else has mentioned it, so I'll be the first. I love the typography and layout. Looks like a proper book. I'll definitely be coming back to this for future reference for style and layout
If you like this article then you might also be interested in: <a href="https://github.com/c-blake/bst">https://github.com/c-blake/bst</a><p>It's ANSI C with a bespoke macro generics system not Go, does not cover as many balancing rules, and is not written up as nicely. It does do something only weakly represented in your Go impls, AFAICT - "binary symmetry" - the reflection about left-right intrinsic to most ideas in this area. (Mehlhorn has a book that does this, but that is the only other source I know.) It also has API considerations like seek-edit separation and how to handle in-tree duplicate keys (FIFO/LIFO/etc. as opposed to values which are also collections).<p>Also, Re: B-trees among several comments here - <i>edit heavy</i> B-trees with very large nodes (driven by IO bandwidth-delay products) need some "mini-scalable" structure for their nodes since shifting (on average) half the entries can cost. That mini-scale could be another B-tree or it could be a binary search tree, perhaps adjusted to have 2-byte sized pointers into the little arena that is a node if 64Knode is enough. I once heard Sybase (now Microsoft SQL Server?) used skip lists. Anyway, this may be a remaining use case for binary trees even in the presence of a B-tree.
Some feedback on readability: the article utilizes the term "logarithmic weight-balance" 6 times without explaining what it is (and several more times after). On the 7th mention, it links to a paper: "Frias[40] published top-down logarithmic weight-balance trees in 2005, under advice from Roura." But the referenced paper doesn't mention logarithmic weight-balance even once, I think [40] calls them "Logarithmic binary search trees", and links to "Roura S., 2001, A new method for balancing binary search trees".<p>After reading the last paragraph:<p>"Any current red-black tree implementation in practice can be replaced by a logarithmic weight-balance tree to achieve better balance, better performance, and get positional access as a bonus, all with a simpler algorithm."<p>... it is clear the article is advocating for these, and the title should hence be something like "Advantages of logarithmic weight-balanced Trees over other balanced trees", or something like that, immediately followed by an explanation of what these are, and what other people call them.<p>The repository linked at the top includes a lot of stuff and makes it hard to find a logarithmic weight-balance tree implementation, lost in all other kinds of trees in there. Would be helpful to see a separated logarithmic weight-balanced tree go module for people looking forward studying its source code.<p>40: L. Frias, Extending STL maps using LBSTs, 2005.
If you'd like to get more attention from academics, you could try submitting a journal paper of the work. The ACM Computing Surveys journal may be a good venue: <a href="https://dl.acm.org/journal/csur" rel="nofollow noreferrer">https://dl.acm.org/journal/csur</a>
Skimming through, I didn't see any discussion of AA trees which I have seen discussed as similar to red black trees but simpler to implement and have been interested in exploring for learning purposes. Any reason you didn't look specifically at AA trees, or were they covered somewhere that I missed?