I'm impressed that there is still room to eke out performance improvements by fiddling with the base data structure; the original author has been working on refining it for quite a while*<p>* <a href="http://t-t-travails.blogspot.com/2008/07/treaps-versus-red-black-trees.html" rel="nofollow">http://t-t-travails.blogspot.com/2008/07/treaps-versus-red-b...</a>, <a href="http://t-t-travails.blogspot.com/2010/04/red-black-trees-revisited.html" rel="nofollow">http://t-t-travails.blogspot.com/2010/04/red-black-trees-rev...</a>
40 points and no comments?<p>Just wanted to say it is great submissions like this that make me come to Hacker News...<p>Thank you for taking the time to write this up and post it here...
YMMV, but for my Python web app executing mostly OO Python 2.5.x code, I got a 10% performance increase by using jemalloc compared to the malloc in RHEL 5. It's as simple as LD_PRELOAD=/path/to/libjemalloc.so. Memory usage is also better: in a long-running process, millions of objects allocated and then eventually released ended up with a smaller amount of memory used when using jemalloc.
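If you would rather set this from a launcher script than from the shell, something like the following should work. It is only a sketch: the helper names `env_with_preload` and `run_with_jemalloc` are made up for illustration, and `/path/to/libjemalloc.so` is the same placeholder as above, so substitute wherever the library lives on your system.

```python
import os
import subprocess


def env_with_preload(lib_path, base_env=None):
    """Return a copy of the environment with lib_path first in LD_PRELOAD.

    Hypothetical helper, not part of jemalloc or the standard library.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # The dynamic linker accepts a colon-separated list, so keep any
    # libraries that were already being preloaded.
    env["LD_PRELOAD"] = lib_path if not existing else lib_path + ":" + existing
    return env


def run_with_jemalloc(argv, lib_path="/path/to/libjemalloc.so"):
    """Run argv as a child process with jemalloc preloaded; return its exit code.

    The default lib_path is a placeholder, not a real install location.
    """
    return subprocess.call(argv, env=env_with_preload(lib_path))


# Example (assumes app.py exists and the library path is real):
# run_with_jemalloc(["python", "app.py"])
```

The preload only affects the child process, so the launcher itself keeps running on the system malloc.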
It would be interesting to see a performance comparison with nedmalloc. It uses a trie instead of a red-black tree for faster lookups.<p><a href="http://www.nedprod.com/programs/portable/nedmalloc/" rel="nofollow">http://www.nedprod.com/programs/portable/nedmalloc/</a><p>Some time ago I evaluated both of them, and they seemed to be the best available choices, but I never found a comparison study.