科技回声

12 条评论

slx26大约 7 年前

A great reminder that spatial locality can be critical to performance. But that's actually quite well-known, and maybe a more important lesson would be: don't take performance for granted, even in standard libraries of well stablished languages or algorithms, there might be much room for improvement for your use case (if you need it).This said, the article feels quite arrogant. As other commenters say, the idea behind the optimization is nothing new, just some basic concepts well applied. Which is a job well done, but probably doesn't justify the tone (that at least I perceive) from the article.

评论 #16611582 未加载

评论 #16612045 未加载

评论 #16612439 未加载

userbinator大约 7 年前

It's pretty much the same reason why databases use B-trees and not binary trees --- when you have to consider the nonuniform latency of storage access, traditional algorithmic analysis that assumes every memory address is accessible in the same time as any other becomes extremely misleading.

评论 #16611670 未加载

评论 #16611555 未加载

abainbridge大约 7 年前

I agree with his rant about "CS departments' algorithmic analysis" failing to cover these real-world hardware issue. That was certainly true of my ~1997 undergrad CS course at the University of St Andrews, that was otherwise exemplary.The only thing I don't understand is why his data structure for finding the least-recently-used item wasn't just a linked list - where each time you touch an object you move it to one end of the list, and when you want to discard an object you remove from the other end. That'd be o(1) and would cause, at most, one page fault per update and none for the add and remove cases (since the two ends would surely be paged "in"). And it'd be a lot easier than his fancy thing. I guess I missed something.It might be worth pointing out that NVMe or Intel Optane would solve his problem (1) without his fancy data structure too. Some might go as far as to say that while his fancy data structure is out dated now, the stuff that the CS departments teach became correct again. Maybe St Andrews had the right idea all along.1) Or huge-pages or NUMA node pinning, to some extent.

评论 #16612007 未加载

评论 #16611224 未加载

评论 #16611754 未加载

评论 #16611810 未加载

alecco大约 7 年前

A similar but more sophisticated use of this was Intel Research's seminal paper on FAST tree (index). Also from 2010.<a href="http://www.webislands.net/pubs/FAST__SIGMOD10.pdf" rel="nofollow">http://www.webislands.net/pubs/FAST__SIGMOD10.pdf</a>

dang大约 7 年前

Past discussions:<a href="https://news.ycombinator.com/item?id=5586889" rel="nofollow">https://news.ycombinator.com/item?id=5586889</a><a href="https://news.ycombinator.com/item?id=1426211" rel="nofollow">https://news.ycombinator.com/item?id=1426211</a>

frankmcsherry大约 7 年前

If anyone wants to read more, there are (pre this article) papers on cache friendly/oblivious priority queues, e.g.<a href="http://erikdemaine.org/papers/BufferTrees_STOC2002/paper.pdf" rel="nofollow">http://erikdemaine.org/papers/BufferTrees_STOC2002/paper.pdf</a>The rough gist is that if the only thing you need to be able to do is insert and pop-min, then you can buffer up roughly a page's worth of incomplete work for each page, getting amortized performance that tracks the optimal number of cache transfers at each level of the cache.Further, in the paper above you are introduced (in the related work) to the citation:"A. LaMarca and R. E. Ladner. The influence of caches on the performance of heaps. Journal of Experimental Algorithmics, 1, 1996."which describes the author's "B-Heap" (at least, it describes a b-ary heap where you pick b to match your cache line/page size).I'm guessing the problem isn't that academics don't understand these things so much that practicing programmers don't always read very much.

Goopplesoft大约 7 年前

> ..., not to mention wasting enormous amounts of hardware and electricity.I wonder how legitimate this concern is from an environmental angle. On one hand a lot of code in the world probably has much worse performance ratios than than the one described in the article -- using python is sufficient to cause an order of performance loss. On the other hand data centers are using more renewable energy sources and reducing their carbon output.Is this only a wasted money problem? How worth while is it to write "green code"? Back of the envelope calculations on how much developer comfort costs would be interesting.

throwaway080383大约 7 年前

I had a quick question about this part: """ Two details are worth noting:* Once we leave a VM page through the bottom, it is important for performance that both child nodes live in the same VM page, because we are going to compare them both with their parent.* Because of this, the tree fails to expand for one generation every time it enters a new VM page in order to use the first two elements in the page productively. """Is the point just that you want to pack 8 elements into each page, but if the tree expands the first generation of a page, then you end up only being able to pack in 6?

matte_black大约 7 年前

Binary Heap was always my favorite data structure because of how it would improve my A* pathfinding performance. Guess I should have used B-Heap.

fithisux大约 7 年前

Despite the arrogance, I would be more interested if there were some more mathematical oriented explanation and/or a self-contained code sample like for example the TinyQueue javascript implementation (any language is welcome though.

kqr大约 7 年前

Do I understand this correctly: to optimise for virtual memory, do the things you'd do to optimise for the CPU cache except on a larger scale?

评论 #16612114 未加载

daddosi大约 7 年前

There is a joke inthere about economics but i cant think of it right now

评论 #16610863 未加载

12 条评论

slx26大约 7 年前

评论 #16611582 未加载

评论 #16612045 未加载

评论 #16612439 未加载

userbinator大约 7 年前

评论 #16611670 未加载

评论 #16611555 未加载

abainbridge大约 7 年前

评论 #16612007 未加载

评论 #16611224 未加载

评论 #16611754 未加载

评论 #16611810 未加载

alecco大约 7 年前

dang大约 7 年前

frankmcsherry大约 7 年前

Goopplesoft大约 7 年前

throwaway080383大约 7 年前

matte_black大约 7 年前

Binary Heap was always my favorite data structure because of how it would improve my A* pathfinding performance. Guess I should have used B-Heap.

fithisux大约 7 年前

kqr大约 7 年前

Do I understand this correctly: to optimise for virtual memory, do the things you'd do to optimise for the CPU cache except on a larger scale?

评论 #16612114 未加载

daddosi大约 7 年前

There is a joke inthere about economics but i cant think of it right now

评论 #16610863 未加载

B-Heap vs. Binary Heap (2010)

12 条评论

B-Heap vs. Binary Heap (2010)

12 条评论