
An Introduction to Cache-Oblivious Data Structures

218 points by rusbus over 7 years ago

9 comments

SlySherZ over 7 years ago

For everyone interested in this, I'd like to recommend the MIT Advanced Data Structures course by Prof. Erik Demaine (mentioned in the article). The part of the course about cache-oblivious data structures starts with this lecture: https://www.youtube.com/watch?v=V3omVLzI0WE
fulafel over 7 years ago

Trivia: why are they called "cache-oblivious" and not "cache-aware"?

The paper that coined the term is here: http://cacs.usc.edu/education/cs653/Frigo-CacheOblivious-FOCS99.pdf

Basically, cache-aware algorithms came first; they assumed certain cache sizes and other properties. "Cache-oblivious" algorithms were a refinement that worked well for many cache sizes.

All in all it's silly that the "cache-oblivious" term was the one that survived, because now cache-unaware and cache-oblivious algorithms mean opposite things, contradicting the dictionary definition of oblivious.
vvanders over 7 years ago

Another fun one is radix sort [1]. We used to use it for sorting alpha'd quads (think particle systems, UI, etc.) from front to back. It easily annihilates anything else in terms of raw performance for datasets < 10,000 with int/f32 keys, and it's stable(!) for identical values.

This talk [2] from Herb Sutter (starting at 22:30) is one of my favorite talks and does a great job of explaining why this is so important (and why dense arrays are still king in many areas).

[1] http://stereopsis.com/radix.html

[2] https://channel9.msdn.com/Events/Build/2014/2-661
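The linked stereopsis page works through a byte-wise LSD radix sort; a minimal Python sketch of that idea follows (the function name and the 32-bit non-negative key assumption are mine, not from the linked page, and a real implementation would use counting passes over flat arrays rather than Python lists):

```python
def radix_sort(keys):
    """Stable LSD radix sort for non-negative 32-bit integer keys.

    One pass per byte, least significant first; 256 buckets per pass.
    Appending into buckets and concatenating them in order is what
    makes the sort stable for identical keys.
    """
    for shift in (0, 8, 16, 24):
        buckets = [[] for _ in range(256)]
        for k in keys:
            buckets[(k >> shift) & 0xFF].append(k)
        keys = [k for bucket in buckets for k in bucket]
    return keys
```

Because each pass only touches the current byte and preserves the order of earlier passes, the total work is O(n) with a constant number of passes, which is where the raw-performance win over comparison sorts comes from.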
joe_the_user over 7 years ago

The article looks great, but a doubling of speed actually seems kind of disappointing (what I'd like to see is something like a "big O" improvement).

I have been investigating cache-oblivious data structures for a while. They're impressive, complex, and a bit difficult to fathom (for a cache-oblivious tree, you have to consider both the relational structure created by pointers and the layout of all the pointers in memory; the article seems to give a nice overview compared to the academic articles I've looked at).

The thing is, in most of the large applications I've tried to optimize, you could squeeze several speed doublings out of fairly easy inner-loop rewrites as soon as you started profiling the app. After that, you got more marginal results from optimizing this and that. Consider: if 25% of the time is spent retrieving data, halving that time saves only 12.5% of total runtime. Is that 12.5% worth the complexity of implementing a cache-oblivious structure? Even if your application is a full database, I'm not sure the trade-offs are worth it.
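The 25%-becomes-12.5% arithmetic above is just Amdahl's law; a tiny sketch makes the diminishing returns easy to play with (function name is mine, for illustration):

```python
def time_after_speedup(fraction, local_speedup):
    """Amdahl's law: new total runtime (relative to 1.0) when
    `fraction` of the work is accelerated by `local_speedup`.
    The remaining (1 - fraction) of the work is unchanged."""
    return (1.0 - fraction) + fraction / local_speedup

# 25% of time in data retrieval, retrieval made 2x faster:
# total runtime drops from 1.0 to 0.875, a 12.5% saving.
```

The same function shows why profiling the hot inner loops first pays off: doubling the speed of a part that is 80% of runtime cuts total time to 0.6, far more than any improvement to a 25% slice can.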
1wu over 7 years ago

Neat: I wrote a C++ implementation of a cache-oblivious (implicit) van Emde Boas binary search tree a [long] while back:

https://github.com/lwu/veb-tree/blob/master/veboas.cpp
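The "implicit" part refers to the van Emde Boas memory layout: a complete binary tree is split at half its height, the top half is laid out first, then each bottom subtree contiguously, recursively. A small Python sketch of that ordering (not taken from the linked C++ code; it emits 1-based BFS indices, root = 1, children of n at 2n and 2n+1):

```python
def veb_order(height):
    """Return the BFS indices of a complete binary tree of the
    given height, listed in van Emde Boas layout order."""
    def recurse(root, h):
        if h == 1:
            return [root]
        top_h = h // 2          # height of the top recursive tree
        bot_h = h - top_h       # height of each bottom subtree
        order = recurse(root, top_h)
        # Find the leaves of the top tree, left to right.
        frontier = [root]
        for _ in range(top_h - 1):
            frontier = [c for n in frontier for c in (2 * n, 2 * n + 1)]
        # Each bottom subtree hangs off a child of a top-tree leaf.
        for leaf in frontier:
            for child in (2 * leaf, 2 * leaf + 1):
                order += recurse(child, bot_h)
        return order
    return recurse(1, height)
```

Because any root-to-leaf path crosses only O(log_B n) of the contiguous recursive blocks, searches incur O(log_B n) cache misses for every block size B, without the code ever knowing B.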
elvinyung over 7 years ago

Dumb question: is it appropriate to describe fractal trees as an "extreme" version of a log-structured merge tree? At a glance (i.e. if you squint really hard), the buffer-flushing process in a fractal tree seems kind of similar to the tablet-merging process in LSM trees, in that they both batch operations to amortize secondary memory access costs, except fractal trees generalize it to all operations.
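The shared idea in both structures is batching writes so the cost of a slow bulk transfer is amortized over many operations. A toy sketch of just that amortization (class and attribute names are illustrative, not from any real fractal-tree or LSM implementation):

```python
class BufferedStore:
    """Toy model of write batching: puts land in a small in-memory
    buffer and are pushed to the slow backing store in bulk, so one
    expensive transfer is amortized over `buffer_size` operations."""

    def __init__(self, buffer_size=4):
        self.buffer_size = buffer_size
        self.buffer = []   # fast in-memory staging area
        self.store = []    # stands in for slow secondary memory
        self.flushes = 0   # number of expensive bulk writes

    def put(self, item):
        self.buffer.append(item)
        if len(self.buffer) >= self.buffer_size:
            self.flush()

    def flush(self):
        # One bulk write covers buffer_size individual puts.
        self.flushes += 1
        self.store.extend(self.buffer)
        self.buffer.clear()
```

In this framing, LSM trees apply the trick at each level's merge step, while a fractal tree keeps a buffer at every internal node and flushes messages downward, which is roughly the "generalized to all operations" intuition in the question.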
djhworld over 7 years ago

Really enjoyed this post, thanks.

I'm not sure I really understood how the cache-oblivious solution works in this case, though; I can't get an intuition for how the code linked on GitHub relates to the description in the article.
cryptonector over 7 years ago

Do fractal tree indexes lend themselves to copy-on-write designs? (B-trees do.)

EDIT: Also, are fractal tree indexes roughly like COW B-trees?
MichaelMoser123 over 7 years ago
Are fractal tree indexes in the public domain, or are they patented?