TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Managing Hierarchical Data in MySQL (or really any database)

58 pointsby mattculbrethalmost 16 years ago

7 comments

znbaileyalmost 16 years ago
The author fails to mention another very common approach that does not suffer (as badly) from the nested set approach. This strategy is called "materialized path".<p>In the materialized path strategy, each node/row has its full path in the tree stored in a column and that path is "materialized" or "derived" from a depth-first traversal to that node. In conjunction with the adjacency list approach this could yield much better results in most situations than a nested set model.<p>Taking the tree used in the article, the paths would look like (using '/' for path delimiter but you can use whatever you like):<p><pre><code> electronics electronics/televisions electronics/televisions/tube electronics/televisions/lcd electronics/televisions/plasma electronics/portable electronics electronics/portable electronics/mp3 players electronics/portable electronics/mp3 players/flash electronics/portable electronics/cd players electronics/portable electronics/2 way radios </code></pre> Alternatively it would make more sense to base the materialized path off of an immutable field such as the PKey of the record (in the same order as the paths above, from the article):<p><pre><code> 1 1/2 1/2/4 1/2/5 1/2/6 1/3 1/3/7 1/3/7/10 1/3/8 1/3/9 </code></pre> This has the following advantages:<p><pre><code> * Node insertion/deletion overhead is low / does not require updating other nodes unlike nested set model * Query for subnodes is easy: "select * from tree where path like 'electronics/televisions'". To get only direct subnodes add a parent_id specifier. * Getting the full path for a node is easy: it's built into the node itself * Node depth is easy: count the number of path "segments" </code></pre> Downsides:<p><pre><code> * Moving sub-tree requires updating all sub-node paths </code></pre> Hope this helps someone!<p>Cheers, -Zach
评论 #710116 未加载
评论 #710306 未加载
评论 #710113 未加载
评论 #710327 未加载
rjurneyalmost 16 years ago
Do yourself a HUGE favor, and just use L-Tree for PostgreSQL instead: <a href="http://www.postgresql.org/docs/current/static/ltree.html" rel="nofollow">http://www.postgresql.org/docs/current/static/ltree.html</a>
评论 #709963 未加载
dabeeeensteralmost 16 years ago
This is quite a neat solution, but it has a few detracting points, most notably mass updates when you want to modify the tree.<p>It also seems slightly irrelevant; surely it's better just to read the structure into memory when the application starts and just work on the data structure in memory?
评论 #709886 未加载
jessepalmost 16 years ago
Here's a comparison of two django apps implementing various ways of doing hierarchies: <a href="http://www.qompr.com/charts/63%3Bdjango-hierarchical-tree-data/" rel="nofollow">http://www.qompr.com/charts/63%3Bdjango-hierarchical-tree-da...</a>
nradovalmost 16 years ago
What a mess. IBM DB2, Oracle, and MS SQL Server all support native storage of hierachical data (as XML). MySQL needs to catch up.
leejalmost 16 years ago
good article <a href="http://www.sitepoint.com/article/hierarchical-data-database/" rel="nofollow">http://www.sitepoint.com/article/hierarchical-data-database/</a>
smithjchrisalmost 16 years ago
I've written an implementation of nested sets for at least 3 DB engines over the years (SQL Server, MySQL and PostgreSQL).<p>Some of that article is plainly dangerous and I wouldn't recommend following it without some extensive research elsewhere. SQL for Smarties is the usual reference.<p>Firstly they seem to insist on MyISAM which is just crap. I don't need to cite any references on that.<p>Secondarily, they use LOCK TABLE whilst modifying the data. That basically blocks other writes yet doesn't give any transactional capabilities. Due to the denormalised nature of the nested set implementation, you NEED transactions to ensure operations are ACID compliant or you risk tree corruption.<p>Thirdly, they do not discuss in detail the performance constraints of nested sets i.e. it is purely a read-optimised. When nested sets start growing in node count, the modification operations do not scale as a significant chunk of the tree needs to be renumbered inside a transaction.
评论 #710156 未加载