The author fails to mention another very common approach that does not suffer (as badly) from the nested set approach. This strategy is called "materialized path".<p>In the materialized path strategy, each node/row has its full path in the tree stored in a column and that path is "materialized" or "derived" from a depth-first traversal to that node. In conjunction with the adjacency list approach this could yield much better results in most situations than a nested set model.<p>Taking the tree used in the article, the paths would look like (using '/' for path delimiter but you can use whatever you like):<p><pre><code> electronics
electronics/televisions
electronics/televisions/tube
electronics/televisions/lcd
electronics/televisions/plasma
electronics/portable electronics
electronics/portable electronics/mp3 players
electronics/portable electronics/mp3 players/flash
electronics/portable electronics/cd players
electronics/portable electronics/2 way radios
</code></pre>
Alternatively it would make more sense to base the materialized path off of an immutable field such as the PKey of the record (in the same order as the paths above, from the article):<p><pre><code> 1
1/2
1/2/4
1/2/5
1/2/6
1/3
1/3/7
1/3/7/10
1/3/8
1/3/9
</code></pre>
This has the following advantages:<p><pre><code> * Node insertion/deletion overhead is low / does not require updating other nodes unlike nested set model
* Query for subnodes is easy: "select * from tree where path like 'electronics/televisions'". To get only direct subnodes add a parent_id specifier.
* Getting the full path for a node is easy: it's built into the node itself
* Node depth is easy: count the number of path "segments"
</code></pre>
Downsides:<p><pre><code> * Moving sub-tree requires updating all sub-node paths
</code></pre>
Hope this helps someone!<p>Cheers,
-Zach
Do yourself a HUGE favor, and just use L-Tree for PostgreSQL instead: <a href="http://www.postgresql.org/docs/current/static/ltree.html" rel="nofollow">http://www.postgresql.org/docs/current/static/ltree.html</a>
This is quite a neat solution, but it has a few detracting points, most notably mass updates when you want to modify the tree.<p>It also seems slightly irrelevant; surely it's better just to read the structure into memory when the application starts and just work on the data structure in memory?
Here's a comparison of two django apps implementing various ways of doing hierarchies: <a href="http://www.qompr.com/charts/63%3Bdjango-hierarchical-tree-data/" rel="nofollow">http://www.qompr.com/charts/63%3Bdjango-hierarchical-tree-da...</a>
good article <a href="http://www.sitepoint.com/article/hierarchical-data-database/" rel="nofollow">http://www.sitepoint.com/article/hierarchical-data-database/</a>
I've written an implementation of nested sets for at least 3 DB engines over the years (SQL Server, MySQL and PostgreSQL).<p>Some of that article is plainly dangerous and I wouldn't recommend following it without some extensive research elsewhere. SQL for Smarties is the usual reference.<p>Firstly they seem to insist on MyISAM which is just crap. I don't need to cite any references on that.<p>Secondarily, they use LOCK TABLE whilst modifying the data. That basically blocks other writes yet doesn't give any transactional capabilities. Due to the denormalised nature of the nested set implementation, you NEED transactions to ensure operations are ACID compliant or you risk tree corruption.<p>Thirdly, they do not discuss in detail the performance constraints of nested sets i.e. it is purely a read-optimised. When nested sets start growing in node count, the modification operations do not scale as a significant chunk of the tree needs to be renumbered inside a transaction.