科技回声

11 comments

bts, over 1 year ago

FWIW there is prior art here, e.g. see IntMap in Haskell: https://hackage.haskell.org/package/containers-0.7/docs/Data-IntMap-Strict.html

jemfinch, over 1 year ago

This really doesn't seem to be comparing to comparable data structures. For int map specializations like this, the optimized alternatives are things like Judy (which is looking quite aged these days) or roaring bitmaps, not to mention that any C++ developer using "ordinary" maps will be using absl's SwissTable (flat_hash_map) or folly's F14 (F14FastMap) or perhaps absl::btree_map if order is important. Comparisons to std::map and std::unordered_map are simply too naive to make the case for this data structure.

lichtenberger, over 1 year ago

We're using a similar trie structure as the main document (node) index in SirixDB[1]. Lately, I got some inspiration from ART and HAMT for using different page sizes, basically for the rightmost inner pages: the node IDs are generated by a simple sequence generator, so all inner pages (we call them IndirectPage) except the rightmost are fully occupied, and the tree height is adapted dynamically depending on the size of the stored data. Currently, each indirect page always stores 1024 references to child pages, but I'll experiment with smaller sizes, as the inner nodes are simply copied for each new revision, whereas the leaf pages storing the actual data are themselves versioned with a novel sliding snapshot algorithm.

From the unique 64-bit nodeId assigned to each piece of data, you can simply compute the page and the reference to traverse on each level in the trie through some bit shifting.

[1] https://github.com/sirixdb/sirix
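The bit shifting described above can be sketched roughly as follows. This is an illustration only, not SirixDB's actual code: the 1024-way fanout comes from the comment, while `LEAF_BITS`, `trie_path`, and the fixed `height` argument are hypothetical names and assumptions made for the example.

```python
FANOUT_BITS = 10                      # 1024 references per indirect page (from the comment)
FANOUT_MASK = (1 << FANOUT_BITS) - 1
LEAF_BITS = 10                        # assumed: 1024 records per leaf page (hypothetical)
LEAF_MASK = (1 << LEAF_BITS) - 1


def trie_path(node_id, height):
    """Return (slots, offset) for a node ID.

    slots  -- the reference index to follow in each indirect page, root first
    offset -- the record position inside the leaf page
    """
    page_key = node_id >> LEAF_BITS   # which leaf page holds the record
    offset = node_id & LEAF_MASK      # where the record sits in that page
    slots = [
        (page_key >> (FANOUT_BITS * level)) & FANOUT_MASK
        for level in range(height - 1, -1, -1)
    ]
    return slots, offset


slots, offset = trie_path(1024 * 1024 + 5 * 1024 + 7, height=2)
print(slots, offset)
```

With sequentially generated IDs, consecutive records fill one leaf page before moving to the next slot, which is why only the rightmost pages are ever partially occupied.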

NWoodsman, over 1 year ago

Also throwing this into the mix, in C#: https://julesjacobs.com/2014/11/11/immutable-vectors-csharp.html

His implementation uses buffers of capacity 32, generics, and bit shifting to do lookups.

winrid, over 1 year ago

Neat, thank you! I'd love to see how it compares to the libgdx IntMap[0].

[0] https://github.com/libgdx/libgdx/blob/master/gdx/src/com/badlogic/gdx/utils/IntMap.java

AaronFriel, over 1 year ago

Interesting! Reminds me a great deal of Judy Arrays: https://en.m.wikipedia.org/wiki/Judy_array

Judy Arrays are a radix trie with branching and a few node types designed to be cache line width optimized.

repsilat, over 1 year ago

Looks rad. I was going to look into some b-trees for a use-case where I need an ordered map of things similar to integers, and this might be better.

I couldn't immediately see: is there mention of whether insertions invalidate iterators? Maybe not strictly needed for my use-case, but good to know.

ww520, over 1 year ago

This looks very good. The idea of using a subnet-mask style to compute the prefix of a node is pretty novel. I haven't seen anything like it. The choice of a span factor of 16 is a good compromise between node size and tree depth. The node slot packing is amazing. Actually, if you relax the restriction from a 64-byte node to a 128-byte node, you can get 64 bits per slot and will get a much higher limit for the item count. Newer CPUs are starting to support 128-byte cache lines.
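As a rough illustration of the two ideas in this comment, the sketch below masks a key the way a subnet mask isolates a network prefix, and works out the slot width for a given node size. It is not the posted data structure's code: the 32-bit key width, the assumption that a node consists of exactly 16 equal slots, and all function names are hypothetical.

```python
KEY_BITS = 32   # assumed key width (hypothetical)
SPAN_BITS = 4   # span factor of 16 = 2**4 children per node


def prefix_mask(depth):
    """Mask that keeps the top depth * SPAN_BITS bits of a key."""
    kept = depth * SPAN_BITS
    full = (1 << KEY_BITS) - 1
    return full & ~((1 << (KEY_BITS - kept)) - 1)


def node_prefix(key, depth):
    """Bits shared by every key stored below a node at this depth."""
    return key & prefix_mask(depth)


def child_slot(key, depth):
    """Which of the 16 children of a node at this depth the key falls into."""
    shift = KEY_BITS - (depth + 1) * SPAN_BITS
    return (key >> shift) & 0xF


def bits_per_slot(node_bytes, slots=16):
    """Slot width if the whole node is divided evenly among its slots."""
    return node_bytes * 8 // slots


print(hex(node_prefix(0xC0A80101, 2)), child_slot(0xC0A80101, 2))
print(bits_per_slot(64), bits_per_slot(128))
```

Under these assumptions a 64-byte node leaves 32 bits per slot and a 128-byte node leaves 64, which is the arithmetic behind the suggestion to widen the node.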

ursusmaritimus, over 1 year ago

Interesting, but the summary does not mention an important fact: the data structure can contain at most 67108864 (2^26) items, which is quite a low limit.

JonChesterfield, over 1 year ago

Appears to be a 16 way branching trie which completely misses both advantages of tree structures over hashes:

1/ This tree is mutable, insert doesn't give you a new tree via path copying

2/ union/intersection style operations can be sublinear. None of the batch operations are implemented
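For readers unfamiliar with path copying, here is a minimal sketch of an insert that returns a new tree and leaves the old one intact. It is a generic illustration of the technique on a 16-way trie, not the posted data structure; the 16-bit key width (`LEVELS = 4`) and the tuple-based node layout are assumptions chosen to keep the example short.

```python
SPAN_BITS = 4
FANOUT = 1 << SPAN_BITS   # 16-way branching
LEVELS = 4                # assumed 16-bit keys: 4 levels of 4 bits each


def insert(node, key, value, level=LEVELS - 1):
    """Return a new root containing key -> value; the old root is not modified."""
    # Copy only the node on the path; its other children are shared by reference.
    children = list(node) if node is not None else [None] * FANOUT
    slot = (key >> (SPAN_BITS * level)) & (FANOUT - 1)
    if level == 0:
        children[slot] = value
    else:
        children[slot] = insert(children[slot], key, value, level - 1)
    return tuple(children)


def lookup(node, key):
    """Return the value stored under key, or None if it is absent."""
    for level in range(LEVELS - 1, -1, -1):
        if node is None:
            return None
        node = node[(key >> (SPAN_BITS * level)) & (FANOUT - 1)]
    return node


v1 = insert(None, 0x1234, "a")
v2 = insert(v1, 0x1235, "b")
v3 = insert(v2, 0x2000, "c")
print(lookup(v1, 0x1235), lookup(v2, 0x1235), v3[1] is v2[1])
```

Each insert copies one node per level, and every subtree off the modified path is shared between versions. That sharing is also what lets union and intersection skip identical subtrees wholesale.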

notfed, over 1 year ago

It'd be nice to include djb's crit-bit tree implementation (linked to from the intro) in the benchmarks ("Performance Test" section).

Show HN: Integer Map Data Structure
