What are the lesser known but useful data structures?

347 pointsby mck-over 11 years ago

22 comments

tikhonjover 11 years ago

There's a whole set of interesting data structures that are not very well known: succinct data structures[1]. The idea is simple: we want to store data in a compressed form, but also perform certain operations quickly without uncompressing.These can be very useful for certain applications. The article on "Cramming 80,000 Words into a JavaScript File"[2] is a nice example. It shows you how you can store a compressed trie in memory but still use it. I also like this[3] series of blog posts leading up to wavelet trees.These certainly count as obscure data structures, unlike many of the ones listed on SO. I had never even considered the idea of compressing data in memory like this, much less encountered actual examples of succinct data structures! I have to thank Edward Kmett for introducing me to the whole field.These data structures are important not just because they're neat themselves, but because they got me to think a new way. In particular, I realized that using pointers all over the place--to represent things like trees--is not always efficient. Instead of parsing data, it might be better to store it as a blob of some sort with a binary index. Just starting to consider details like that is valuable all on its own.[1]: <a href="http://en.wikipedia.org/wiki/Succinct_data_structure" rel="nofollow">http://en.wikipedia.org/wiki/Succinct_data_structure</a>[2]: <a href="http://stevehanov.ca/blog/index.php/?id=120" rel="nofollow">http://stevehanov.ca/blog/index.php/?id=120</a>[3]: <a href="http://alexbowe.com/rrr/" rel="nofollow">http://alexbowe.com/rrr/</a> and <a href="http://alexbowe.com/wavelet-trees/" rel="nofollow">http://alexbowe.com/wavelet-trees/</a>

评论 #7079933 未加载

评论 #7080668 未加载

评论 #7080666 未加载

kintamanimattover 11 years ago

> This question exists because it has historical significance, but it is not considered a good, on-topic question for this site, so please do not use it as evidence that you can ask similar questions here.Yet it's one of the best questions on SO. Something's very wrong with SO if this isn't considered a good, on-topic question for a programming Q&A site.

评论 #7079944 未加载

teddyhover 11 years ago

Even though they are included in the GNU C library, most people do not seem to know about Obstacks:An "obstack" is a pool of memory containing a stack of objects. You can create any number of separate obstacks, and then allocate objects in specified obstacks. Within each obstack, the last object allocated must always be the first one freed, but distinct obstacks are independent of each other.Aside from this one constraint of order of freeing, obstacks are totally general: an obstack can contain any number of objects of any size. They are implemented with macros, so allocation is usually very fast as long as the objects are usually small. And the only space overhead per object is the padding needed to start each object on a suitable boundary.<a href="https://www.gnu.org/software/libc/manual/html_node/Obstacks.html" rel="nofollow">https://www.gnu.org/software/libc/manual/html_node/Obstacks....</a>Sure, they’re not very interesting, but the point is that you get them for free in the GNU C standard library.

评论 #7096615 未加载

bazzarghover 11 years ago

It's not highlighting one thing, but Chris Okasaki's book on Purely Functional Data Structures, and this brilliant top answer to a question about functional data structures published since the book will keep you in reading material for a while: <a href="http://cstheory.stackexchange.com/a/1550" rel="nofollow">http://cstheory.stackexchange.com/a/1550</a>(it was all 'lesser known' to me when I started using haskell not so long ago)

评论 #7081888 未加载

batbombover 11 years ago

I use a Hierarchical Triangular Mesh for indexing gamma ray events from the universe. The data is partitioned in the database according to it's HTM id.<a href="http://arxiv.org/pdf/cs/0701164.pdf" rel="nofollow">http://arxiv.org/pdf/cs/0701164.pdf</a>Currently I use this for indexing ~11 billion gamma ray events. Researchers typically supply a region in the sky, a search radius, and some cuts (energy, event quality, etc...)

评论 #7080100 未加载

hyperpapeover 11 years ago

Looking at these lists, I strongly suspect that people upvote based on whether they personally recognize the data structure.It goes against the intent of the original question, but iIt's almost ideally designed to make you feel good--you get the rush of knowledge then nerd sniped as you head to wikipedia.

评论 #7079753 未加载

VexXtremeover 11 years ago

I love how the question was locked because "it is not considered a good, on-topic question for this site". It's crazy. Unless an extremely specific concrete answer can be given, a question immediately gets killed. SO has turned into such a turd of a website.

评论 #7080762 未加载

评论 #7080850 未加载

评论 #7081747 未加载

评论 #7082226 未加载

评论 #7080625 未加载

chasover 11 years ago

I'm happy to see finger trees got mentioned. Finger trees[0] are extremely useful and general data structure that can be used to implement persistent sequences, priority queues, search trees and priority search queues. (Haskell's Data.Sequence[1] uses specialized 2-3 finger trees internally) They can form the basis of all sorts of interesting custom structures by supplying the appropriate monoid[3], but this does make them harder to approach if you are not familiar with the abstractions.[3] A monoid is any structure that has members that can combine associatively. In addition, it must have an element that can combine with any other element and result in the other element. Some examples: (strings, string concatenation, the empty string); (integers, addition, 0); (natural numbers, max, 0); (booleans, and, True); (functions, composition, the identity function). The functional pearl[2] that describes the design of Haskell's diagrams library[4] goes into much more detail if you are interested in their application to programming.[0] <a href="http://apfelmus.nfshost.com/articles/monoid-fingertree.html" rel="nofollow">http://apfelmus.nfshost.com/articles/monoid-fingertree.html</a>[1] <a href="http://hackage.haskell.org/package/containers-0.5.4.0/docs/Data-Sequence.html" rel="nofollow">http://hackage.haskell.org/package/containers-0.5.4.0/docs/D...</a>[2] <a href="http://www.cis.upenn.edu/~byorgey/pub/monoid-pearl.pdf" rel="nofollow">http://www.cis.upenn.edu/~byorgey/pub/monoid-pearl.pdf</a>[4] <a href="http://projects.haskell.org/diagrams/" rel="nofollow">http://projects.haskell.org/diagrams/</a>

评论 #7081924 未加载

jbogganover 11 years ago

Bloom filters and count-min sketch are awesome.<a href="http://en.wikipedia.org/wiki/Bloom_filter" rel="nofollow">http://en.wikipedia.org/wiki/Bloom_filter</a> <a href="http://en.wikipedia.org/wiki/Count%E2%80%93min_sketch" rel="nofollow">http://en.wikipedia.org/wiki/Count%E2%80%93min_sketch</a>

nilknover 11 years ago

I don't really consider tries and bloom filters all that poorly known. These commonly come up in interviews for fresh graduates at Google/Facebook. Zippers, skip lists, ropes, round-robin databases, etc. are more genuinely not known I think.

abcd_fover 11 years ago

XOR linked list: <a href="http://en.wikipedia.org/wiki/XOR_linked_list" rel="nofollow">http://en.wikipedia.org/wiki/XOR_linked_list</a>It's a double-linked list with just one link per node. However, to start traversing it you have to know at least two adjacent nodes.PS. May not be useful per se, but interesting nonetheless.

评论 #7081512 未加载

krisgeeover 11 years ago

I was going to say Trie but it was the first response to the SO thread so I guess it wasn't as little known as I thought.I implemented it because I was making a game that had scrabble elements in it and needed to check ahead to see if the player had a word that could still take letters (a prefix) or if they'd hit a dead end. Fit the whole SOWPODS into a remarkably tiny space with millisecond lookups. Probably my favourite part of the project.

lifthrasiirover 11 years ago

I found the following page in the Concatenative wiki particularly interesting: <a href="http://concatenative.org/wiki/view/Exotic%20Data%20Structures" rel="nofollow">http://concatenative.org/wiki/view/Exotic%20Data%20Structure...</a> (Note that the page itself is not related to the concatenative languages.)

nlover 11 years ago

Bloom Filters: <a href="http://en.wikipedia.org/wiki/Bloom_filter" rel="nofollow">http://en.wikipedia.org/wiki/Bloom_filter</a>Hamming Codes: <a href="http://en.wikipedia.org/wiki/Hamming_code" rel="nofollow">http://en.wikipedia.org/wiki/Hamming_code</a>

ww520over 11 years ago

Extendible hashing is amazing in space utilization while retaining the performance of hashing.

shurcooLover 11 years ago

Does anyone know of an implementation of rope in golang? Something a little more feature complete than <a href="https://github.com/christianvozar/rope" rel="nofollow">https://github.com/christianvozar/rope</a>.

swahover 11 years ago

Let me just drop this video that is on my watchlist: <a href="http://www.youtube.com/watch?v=-sEdiFMntMA&feature=share&list=PLFDnELG9dpVxEpbyL53CYebmLI58qJhlt" rel="nofollow">http://www.youtube.com/watch?v=-sEdiFMntMA&feature=share&lis...</a> (Erik Dermaine is the lecturer)

pnathanover 11 years ago

I recently learned about the spatial index tree family in connection with data mining. I hope to implement a data-mining centric X tree (n-dimensional) solution for a data analytics package I'm writing soon. That family is is how you efficiently handle KNN lookups, afaict.

WWKongover 11 years ago

Me and my friend were pretty serious about creating a new data structure called "drum". A drum is a one way store. You write to it but can't read from it. We put it off till we figured a practical use.

评论 #7080644 未加载

评论 #7083884 未加载

评论 #7080500 未加载

keefeover 11 years ago

imho most DS & Algo are like kihon - the basic principles t'werk just fine, it's about augments and applications.

halfdeadcatover 11 years ago

Cup-a-bits

serge2kover 11 years ago

It's mentioned in the link, but circular/ring buffers.I've been grappling with decoding/playing back an audio stream and wouldn't have gotten it working if I hadn't found out about boosts lockfree ring buffer.

评论 #7083507 未加载