TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Using Uninitialized Memory for Fun and Profit

85 pointsby l0stmanabout 15 years ago

4 comments

chmikeabout 15 years ago
That data structure can be actually used for a log structured database where the sparse vector is the log file and the dense vector is the record index.<p>The index of the dense vector is then the record identifier, and the dense vector contains the offset of the valid record in the log database.<p>The record identifier is a handle to the record data. It can be reused when the record is deleted but will remain the same for the lifetime of the record.<p>The dense vector is very compact and may be stored and retrieved efficiently, even if stored in the log itself. The record key index stores the association between the key and the record identifier.<p>The GC or crash recovery process can easily locate valid records by checking if their offset matches the one found in the dense index.<p>So this data structure is not just of historical interest. Thanks for the link.
antirezabout 15 years ago
An interesting implementation of this trick can be found in the LZF compression lib: <a href="http://oldhome.schmorp.de/marc/liblzf.html" rel="nofollow">http://oldhome.schmorp.de/marc/liblzf.html</a><p>Basically the hash table used to compare the current input with already seen input is uninitialized, since the two values are anyway compared bit-per-bit to actually make sure there is a match.
jrockwayabout 15 years ago
malloc + memset takes about .24 seconds per gig on my machine. So be sure that your dataset is really, really sparse before you try this.
评论 #1156842 未加载
评论 #1156723 未加载
评论 #1156943 未加载
barrkelabout 15 years ago
Liveness analysis - the motivating example - is typically in the form of a set with one element per local variable. Most practical routines will have less than 32 or 64 variables, however, so even in practice a 32-bit or 64-bit word will be better. Also, you often have to compute other set operations, such as union, intersection and difference, rather than just membership and iteration.
评论 #1174172 未加载