TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Judy, an efficient sparse dynamic array implementation

25 pointsby tillulenover 15 years ago

3 comments

tptacekover 15 years ago
I've always thought Sean Barrett's take on Judy Arrays was a really good exemplar of casual computer science writing:<p><a href="http://nothings.org/computer/judy/" rel="nofollow">http://nothings.org/computer/judy/</a><p><i>There are a few potential disadvantages to keep in mind about the Judy approach. Its author argues that approaching the problem as engineering means accepting that complexity is how you get good performance; simplicity is for theoreticians. But this complexity leads to things like a data structure with 20,000 lines of code designed on the assumption that a cache-line is 64 bytes being used primarily on machines with 32-byte cache lines; a redesign optimized for 32-byte cache lines would come close to starting from scratch.</i><p>...<p><i>If your data is strictly sequential; you should use a regular array. If your data is often sequential, or approximately sequential (e.g. an arithmetic sequence stepping by 64), Judy might be the best data structure to use. If you need to keep space to a minimum--you have a huge number of associative arrays, or you're only storing very small values, Judy is probably a good idea. If you need an sorted iterator, go with Judy. Otherwise, a hash table may be just as effective, possibly faster, and much simpler.</i><p><i>Furthermore, there aren't many ways to improve Judy, but there's a lot of room to improve hash tables</i> ... read the article for the rest of this (very good) graf.<p>Oh, and if it helps his credibility any...<p><a href="http://nothings.org/computer/knuth.jpg" rel="nofollow">http://nothings.org/computer/knuth.jpg</a><p>:)
评论 #860451 未加载
vicayaover 15 years ago
According to Nikolas Askitis' PhD thesis, Judy Array actually underperforms Burst trie (also a sorted collection) by significant margins in both space and time. Burst trie is much easier to implement (a few hundred lines of C++ or Java code) and tune but he has so far refused to release his code for people to independently verify his claims.<p>cf. <a href="http://www.naskitis.com/" rel="nofollow">http://www.naskitis.com/</a>
moonpolysoftover 15 years ago
Judy arrays are really great. I have a local cache implementation in erlang which uses them: <a href="http://github.com/cliffmoon/cherly" rel="nofollow">http://github.com/cliffmoon/cherly</a><p>Even with a small number of items they are relatively fast, however they do not really start outperforming ordinary hash tables until you get above around 100,000 or so items.