See also Aeron[1] which is basically a networked superset of the Disruptor, featuring (IIRC) a just-as-fast machine-local implementation. Made by some of the same people as the LMAX Disruptor. It's not <i>quite</i> ready for general use AFAICT, but it's quickly getting there...<p>There's also a really informative presentation[2] by Martin Thompson on some of the ideas that they're using to get really low latency and high throughput.<p>[1] <a href="https://github.com/real-logic/Aeron" rel="nofollow">https://github.com/real-logic/Aeron</a><p>[2] <a href="https://www.infoq.com/presentations/aeron-messaging" rel="nofollow">https://www.infoq.com/presentations/aeron-messaging</a>
I really don't want to put this down. This looks very interesting and is nicely presented. But I have trouble recognizing the main idea(s) behind the "LMAX Disruptor". To me, this all boils down to:<p>"You get better performance if you implement your event queue via a pre-allocated ring buffer instead of (dynamically allocated) linked lists or arrays."<p>Is it really just that? If so, this is nothing new, but quite common in game programming and related fields. For example, see: <a href="https://fgiesen.wordpress.com/2010/12/14/ring-buffers-and-queues/" rel="nofollow">https://fgiesen.wordpress.com/2010/12/14/ring-buffers-and-qu...</a><p>What am I missing?
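For what it's worth, a minimal sketch of the pre-allocated ring-buffer idea being described (this is illustrative only, not the Disruptor's actual API or code): a single producer and a single consumer each advance their own counter, the backing array is allocated once up front, and the capacity is a power of two so the slot index is just `counter & mask`.

```java
// Sketch of a single-producer/single-consumer pre-allocated ring buffer.
// Illustrative names only - not taken from the Disruptor codebase.
final class SpscRingBuffer<T> {
    private final Object[] slots;   // allocated once, never resized
    private final int mask;
    private volatile long head = 0; // next slot to read  (written by consumer only)
    private volatile long tail = 0; // next slot to write (written by producer only)

    SpscRingBuffer(int capacityPowerOfTwo) {
        slots = new Object[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    boolean offer(T value) {
        if (tail - head == slots.length) return false; // full
        slots[(int) (tail & mask)] = value;
        tail = tail + 1; // volatile write publishes the slot to the consumer
        return true;
    }

    @SuppressWarnings("unchecked")
    T poll() {
        if (head == tail) return null; // empty
        T value = (T) slots[(int) (head & mask)];
        head = head + 1; // volatile write frees the slot for the producer
        return value;
    }
}
```

Each counter has exactly one writer, so the plain read-increment-write on a volatile field is safe here; with multiple producers or consumers you'd need CAS loops or the Disruptor's sequence-claiming machinery.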
ArrayBlockingQueue [1] uses a ring buffer backed by a fixed-size array, too. It is really hard to say where the speed-up comes from without seeing the code, but it is certainly not from using a ring buffer, because both of the compared implementations do so. I guess they were just more careful when they designed the auxiliary state and the operations on it - for example, ArrayBlockingQueue explicitly tracks the number of elements in the queue with a separate field instead of deriving it from the insert and remove pointers, and therefore this field is contended by readers and writers. Also, preallocating entries for all slots to work around Java's lack of support for arrays of value types certainly removes some overhead from the allocator.<p>EDIT: Just realized that the code is on GitHub - will have a look.<p>[1] <a href="http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/concurrent/ArrayBlockingQueue.java" rel="nofollow">http://grepcode.com/file/repository.grepcode.com/java/root/j...</a>
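To illustrate the derived-size point (a sketch, not the actual Disruptor or ArrayBlockingQueue code): instead of a shared count field that both sides must write to (and so contend on), keep two monotonically increasing counters, each written by only one side, and compute the size on demand.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: size derived from two single-writer counters, so producer and
// consumer never write to the same field. Names are illustrative only.
final class DerivedSizeCounters {
    private final AtomicLong produced = new AtomicLong(); // written by producer only
    private final AtomicLong consumed = new AtomicLong(); // written by consumer only

    void onProduce() { produced.lazySet(produced.get() + 1); }
    void onConsume() { consumed.lazySet(consumed.get() + 1); }

    // Derived, read-only on both counters - no write contention.
    long size() { return produced.get() - consumed.get(); }
}
```

With a single writer per counter there's no read-modify-write race, and `lazySet` avoids a full fence on the hot path; a shared `count` field would instead need locking or CAS on every operation.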
That is an old presentation. The most up to date info can be found at <a href="https://github.com/LMAX-Exchange/disruptor/wiki/Introduction" rel="nofollow">https://github.com/LMAX-Exchange/disruptor/wiki/Introduction</a>
There is a podcast at [1] where the author of the disruptor is interviewed.<p>1: <a href="http://www.se-radio.net/2016/04/se-radio-episode-254-mike-barker-on-the-lmax-architecture/" rel="nofollow">http://www.se-radio.net/2016/04/se-radio-episode-254-mike-ba...</a>
Also take a look at:<p><a href="http://chronicle.software/products/chronicle-queue/" rel="nofollow">http://chronicle.software/products/chronicle-queue/</a>
It's been a while, but didn't the performance turn out to be similar to other data structures in multi-socket scenarios, because it requires interlocked-type instructions?