
How Google Sparsehash achieves two bits of overhead per entry using sparsetable

54 points by Smerity almost 10 years ago

5 comments

eutectic almost 10 years ago
Note that the maximum fill factor of hashtables can be increased well above 90%, with performance comparable to or better than standard open addressing, using Robin Hood collision resolution with backwards shift deletion. The hashtable in the Rust standard library works like this, and there is an argument to be made that it should become standard.

Of course, this doesn't do away with the need to resize, so it might not be appropriate for low-latency or memory-constrained systems. I wonder whether a tree-based data structure like a crit-bit tree or a HAMT might be able to do similar things to sparsehash with better performance.
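A minimal sketch of Robin Hood insertion and backwards shift deletion, assuming a simple open-addressed set of non-negative integers where the key doubles as its own hash and no resizing is shown; this is illustrative only, not the Rust standard library's implementation:

    // Minimal Robin Hood open-addressing sketch (illustrative only).
    // Keys are non-negative and act as their own hash; -1 marks an empty
    // slot; no resizing, so the table is assumed never to fill up.
    #include <cstdint>
    #include <utility>
    #include <vector>

    struct RobinHoodSet {
        std::vector<int64_t> slots;  // -1 == empty
        explicit RobinHoodSet(size_t cap) : slots(cap, -1) {}

        // How far the entry at `pos` has been pushed from its home slot.
        size_t probe_distance(size_t pos, int64_t key) const {
            size_t home = static_cast<size_t>(key) % slots.size();
            return (pos + slots.size() - home) % slots.size();
        }

        void insert(int64_t key) {
            size_t pos = static_cast<size_t>(key) % slots.size();
            size_t dist = 0;
            while (true) {
                if (slots[pos] == -1) { slots[pos] = key; return; }
                if (slots[pos] == key) return;  // already present
                // Robin Hood rule: steal the slot from a "richer" entry,
                // i.e. one that sits closer to its home than we do.
                size_t existing = probe_distance(pos, slots[pos]);
                if (existing < dist) {
                    std::swap(key, slots[pos]);
                    dist = existing;
                }
                pos = (pos + 1) % slots.size();
                ++dist;
            }
        }

        void erase(int64_t key) {
            size_t pos = static_cast<size_t>(key) % slots.size();
            while (slots[pos] != key) {
                if (slots[pos] == -1) return;  // not present
                pos = (pos + 1) % slots.size();
            }
            // Backwards shift deletion: instead of a tombstone, pull the
            // following entries back one slot until we hit an empty slot
            // or an entry that is already in its home position.
            size_t next = (pos + 1) % slots.size();
            while (slots[next] != -1 && probe_distance(next, slots[next]) > 0) {
                slots[pos] = slots[next];
                pos = next;
                next = (next + 1) % slots.size();
            }
            slots[pos] = -1;
        }
    };

The swap on insertion evens out probe distances across the table, which is what keeps lookups cheap even at very high load factors.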
im3w1l almost 10 years ago
tl;dr: Open addressing. Group adjacent slots into buckets. Every bucket consists of an occupancy bitmap (1 bit of overhead per element) and a pointer to the bucket's elements (ptrsize / nElementsPerBucket bits of overhead per element).
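A rough sketch of that layout, assuming a 48-slot group packed into a 64-bit bitmap; the types, group size, and names are illustrative, not the actual Sparsehash code:

    // Sketch of a sparsetable-style group: a 48-slot bucket stored as a
    // 48-bit occupancy bitmap plus a pointer to a tightly packed array
    // holding only the occupied slots. Empty slots cost one bit each.
    // (__builtin_popcountll is the GCC/Clang intrinsic; C++20 has std::popcount.)
    #include <cstdint>

    struct SparseGroup {
        static const int kGroupSize = 48;
        uint64_t bitmap = 0;       // bit i set => logical slot i is occupied
        int32_t* items = nullptr;  // packed array, one entry per set bit

        bool occupied(int i) const { return (bitmap >> i) & 1; }

        // Rank: number of occupied slots strictly below i, which is exactly
        // slot i's index into the packed `items` array.
        int rank(int i) const {
            return __builtin_popcountll(bitmap & ((uint64_t(1) << i) - 1));
        }

        int32_t* find(int i) {
            return occupied(i) ? &items[rank(i)] : nullptr;
        }
    };

A lookup turns a logical slot index into a packed index with one popcount, and an empty slot costs only its bit plus an amortized share of the group's pointer.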
dspeyer almost 10 years ago
So we want to insert a value. We find an empty slot. We find the bucket it's in. Do we have to realloc the bucket and move all downstream data down by one? That sounds slow.
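For concreteness, here is roughly what inserting into one group could look like under the bitmap-plus-packed-array layout sketched above (again illustrative, not the actual Sparsehash code); the shift is bounded by the group size rather than by the whole table:

    // Continues the SparseGroup sketch above: inserting into logical slot i
    // grows the packed array by one and shifts the entries after the
    // insertion point. The copy is bounded by the group size (48 here),
    // not by the size of the whole table. Error handling omitted.
    #include <cstdlib>
    #include <cstring>

    void group_insert(SparseGroup& g, int i, int32_t value) {
        if (g.occupied(i)) { g.items[g.rank(i)] = value; return; }
        int n   = __builtin_popcountll(g.bitmap);  // entries currently stored
        int pos = g.rank(i);                       // where the new entry goes
        int32_t* grown = static_cast<int32_t*>(
            std::realloc(g.items, (n + 1) * sizeof(int32_t)));
        std::memmove(grown + pos + 1, grown + pos, (n - pos) * sizeof(int32_t));
        grown[pos] = value;
        g.items = grown;
        g.bitmap |= uint64_t(1) << i;
    }

Keeping groups small bounds each realloc-and-shift, which is the usual trade-off in this layout: slower insertion than a flat table in exchange for much lower memory overhead.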
TheLoneWolfling almost 10 years ago
Any idea how much space overhead malloc has with entries as small as this? (<1 to 4> * 8 bytes)

Seems to me it could easily be a significant amount of overhead.
signa11 almost 10 years ago
In case folks are interested in this topic, I would heartily recommend Mihai Pătraşcu's blog for some theory :)