LMAX Disruptor – High Performance Inter-Thread Messaging Library

126 points by dgudkov over 1 year ago

15 comments

pclmulqdq over 1 year ago
Every time a new generation plays with the LMAX disruptor, it's time to remind them that the modes with multiple producers/consumers can have really bad tail latency if your application's threading is not designed in the intended way.

Disruptor and most other data structures that come from trading are designed to run on thread-per-core systems. This means systems where there will be no preemption during a critical section. They can get away with a lot of shenanigans on the concurrency model due to this. If you are using these data structures and have a thread-per-request model, you're probably going to have a bad time.
samsquire over 1 year ago
I am working on a C version of the disruptor ring buffer. It is very simple and I still need to verify it, so it's probably not ready for others, but it might be interesting. Aligning to 128 bytes has dropped latency and stopped false sharing.

I have gotten latencies down to 50 nanoseconds (and up).

disruptor-multi.c (SPMC) and disruptor-multi-producer.c (MPSC): https://GitHub.com/samsquire/assembly

I am trying to work out how to support multiple producers and multiple readers (MPMC) at low latency; that's what I'm literally working on today.

The MPSC and SPMC versions seem to be working at low latencies.

I am hoping to apply the actor model to the ring buffer for communication.

I'm also working on a non-blocking, lock-free barrier. This has latency as low as 42 nanoseconds (and up).
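
For readers who want to see the shape of the idea, here is a minimal Java sketch (not samsquire's C code, and not the Disruptor's own implementation) of a single-producer/single-consumer ring buffer whose two cursors are padded so that they should not share a cache line. The names `SpscRing`, `PaddedCursor`, `LeftPad` and `CursorValue` are hypothetical, the 120 bytes of padding on each side simply mirror the 128-byte figure in the comment, and the JVM does not guarantee field layout, so the padding is best-effort. It assumes Java 9 or later for `Thread.onSpinWait()`.

```java
import java.util.function.LongConsumer;

/** Single-producer/single-consumer ring buffer sketch (hypothetical names). */
final class SpscRing {
    private final long[] slots;
    private final int mask;
    // Each cursor has exactly one writer thread, so no CAS is needed.
    private final PaddedCursor head = new PaddedCursor(); // next sequence to read (consumer-owned)
    private final PaddedCursor tail = new PaddedCursor(); // next sequence to write (producer-owned)

    SpscRing(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new long[capacity];
        mask = capacity - 1;
    }

    /** Producer thread only. Returns false when the ring is full. */
    boolean offer(long v) {
        long t = tail.value;
        if (t - head.value == slots.length) return false;
        slots[(int) (t & mask)] = v;
        tail.value = t + 1; // volatile store publishes the slot write
        return true;
    }

    /** Consumer thread only. Hands every available entry to the handler as one batch. */
    int drain(LongConsumer handler) {
        long h = head.value;
        long available = tail.value; // volatile load: slots below this are visible
        for (long s = h; s < available; s++) {
            handler.accept(slots[(int) (s & mask)]);
        }
        head.value = available; // a single store frees the whole batch
        return (int) (available - h);
    }

    public static void main(String[] args) throws InterruptedException {
        SpscRing ring = new SpscRing(1024);
        final int total = 100_000;
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= total; i++) {
                while (!ring.offer(i)) Thread.onSpinWait(); // spin while full
            }
        });
        producer.start();
        long[] sum = {0};
        int seen = 0;
        while (seen < total) seen += ring.drain(v -> sum[0] += v);
        producer.join();
        System.out.println("consumed " + seen + " entries, sum = " + sum[0]);
    }
}

// Best-effort padding: superclass fields are typically laid out before subclass
// fields, giving `value` 15 longs (120 bytes) on each side. Not a JVM guarantee.
class LeftPad { long l1, l2, l3, l4, l5, l6, l7, l8, l9, l10, l11, l12, l13, l14, l15; }
class CursorValue extends LeftPad { volatile long value; }
final class PaddedCursor extends CursorValue { long r1, r2, r3, r4, r5, r6, r7, r8, r9, r10, r11, r12, r13, r14, r15; }
```

Because each cursor is written by exactly one thread, plain volatile loads and stores are enough here; the multi-producer and multi-consumer variants discussed in the comment need more than this.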
colanderman over 1 year ago
I had implemented more-or-less this same concurrency scheme for an IPS/DDoS prevention box ~10 years ago, running on the Tilera architecture. It was fast (batching + separating read & write heads really does help a ton)... but not as fast as Tilera's built-in intercore fabric. The fabric had some limitations, but it was basically a register store/load to access and had only something like 1 or 2 cycles of intercore latency.

(Aside, generic atomic operation pro-tip: don't, if you can help it. Load + local modify + store is always faster than atomic modify, if you can make the memory ordering work out. And if you can't do away with an atomic modify, at least batch your updates locally to issue fewer of them.)
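
The two tips in the aside can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: `CounterSketch` and its methods are hypothetical names, the single-writer variant is correct only if exactly one thread ever writes the counter, and the code shows the shape of each approach without measuring or claiming any particular speedup. `AtomicLong.getPlain()` and `setRelease()` require Java 9 or later.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: atomic read-modify-write vs. single-writer store vs. local batching. */
final class CounterSketch {
    private final AtomicLong shared = new AtomicLong(); // readable from any thread

    /** Safe from any thread: one atomic read-modify-write per call. */
    void incrementAtomic() {
        shared.incrementAndGet();
    }

    /**
     * ASSUMPTION: only one thread ever writes `shared`.
     * Load + local modify + store, with a release store so readers that
     * load the counter with acquire or volatile semantics see a consistent value.
     */
    void incrementSingleWriter() {
        shared.setRelease(shared.getPlain() + 1);
    }

    long read() {
        return shared.get();
    }

    /**
     * Multi-writer case: accumulate locally and issue one atomic add per batch.
     * Returns how many atomic operations were issued.
     */
    static int countBatched(AtomicLong total, int events, int batchSize) {
        long local = 0;
        int atomicOps = 0;
        for (int i = 0; i < events; i++) {
            local++; // thread-local work, no shared-memory traffic
            if (local == batchSize) {
                total.addAndGet(local);
                local = 0;
                atomicOps++;
            }
        }
        if (local > 0) { // flush the remainder
            total.addAndGet(local);
            atomicOps++;
        }
        return atomicOps;
    }

    public static void main(String[] args) {
        AtomicLong total = new AtomicLong();
        int ops = countBatched(total, 1_000, 64);
        System.out.println("events = " + total.get() + ", atomic operations = " + ops);
    }
}
```

The trade-off in the batched version is visibility: other threads see the total lag behind by up to one batch.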
dang over 1 year ago
Related. Others?

Disruptor: High performance alternative to bounded queues - https://news.ycombinator.com/item?id=36073710 - May 2023 (1 comment)

LMAX Disruptor: High performance method for exchanging data between threads - https://news.ycombinator.com/item?id=30778042 - March 2022 (1 comment)

The LMAX Architecture - https://news.ycombinator.com/item?id=22369438 - Feb 2020 (1 comment)

You could have invented the LMAX Disruptor, if only you were limited enough - https://news.ycombinator.com/item?id=17817254 - Aug 2018 (29 comments)

Disruptor: High performance alternative to bounded queues (2011) [pdf] - https://news.ycombinator.com/item?id=12054503 - July 2016 (27 comments)

The LMAX Architecture (2011) - https://news.ycombinator.com/item?id=9753044 - June 2015 (4 comments)

LMAX Disruptor: High Performance Inter-Thread Messaging Library - https://news.ycombinator.com/item?id=8064846 - July 2014 (2 comments)

Serious high-performance and lock-free algorithms (by LMAX devs) - https://news.ycombinator.com/item?id=4022977 - May 2012 (17 comments)

The LMAX Architecture - 100K TPS at Less than 1ms Latency - https://news.ycombinator.com/item?id=3173993 - Oct 2011 (53 comments)
yafetn over 1 year ago
Semi-related is the Aeron project: https://github.com/real-logic/aeron
vkaku over 1 year ago
I've actually seen this particular library used (and misused and abused). People tried to offload I/O and data-heavy tasks onto it, and it was a spectacular failure, with multiple threads getting blocked and people having to frequently adjust its buffer size and batch size.

One of those things to remember is that Java I/O layering (stuff like JPA) is really terrible. And people in the Java world I know tend to prefer the abstractions, while the people in the trading world try to use GC-less code (unboxed primitives and byte arrays).

Unless you have verified that your E2E I/O is really fast (possibly off-heap), that you're just pushing a few bytes here and there, and that your latencies are all in check, this library is not for you. Do all that work first, then use this library.
convexstrictly over 1 year ago
LMAX - How to Do 100K TPS at Less than 1ms Latency: Video

https://www.infoq.com/presentations/LMAX/
vinay_ys over 1 year ago
There's a whole new generation of engineers for whom this is new news. Enjoy!
mgaunard over 1 year ago
I built trading systems for LMAX exchanges. Their technology seems quite far from the state of the art to me.

I didn't know they even claimed to attempt being the fastest exchange in the world. They're very far from being so and it's quite clear that there are architectural decisions in that platform that would prevent that.
pixelmonkey over 1 year ago
Martin Fowler has a lovely deep-dive blog post on this architecture:

https://martinfowler.com/articles/lmax.html

It includes lots of diagrams and citations.

One term I always loved re: LMAX is “mechanical sympathy.” Covered in this section:

https://martinfowler.com/articles/lmax.html#QueuesAndTheirLackOfMechanicalSympathy
jojohohanon over 1 year ago
I came across this a few years back when numbly watching the dependencies scroll by during some Java install.

“Disruptor is a fairly presumptuous name for a package,” I thought. So I looked into it. It fed musings and thought experiments for many walks to and from the T. I love the balance between simplicity and subtlety in the design.

If I recall, it was a dependency for log4j, which makes sense for high-volume logging.
bob1029 over 1 year ago
I love this pattern. There are many problems that fit it quite well once you start thinking in these terms: intentionally delaying execution over (brief amounts of) time in order to create batching opportunities which leverage the physical hardware's unique quirks.

Any domain with synchronous/serializable semantics can be modeled as a single writer with an MPSC queue in front of it. Things like game worlds, business decision systems, database engines, etc. fit the mold fairly well.

The busy-spin strategy can be viewed as a downside, but I can't ignore the latency advantages. In my experiments where I have some live "analog" input like a mouse, the busy-wait strategy feels 100% transparent. I've tested it for hours without it breaking into the millisecond range (on Windows 10!). For gaming/real-time UI cases, you either want this or yield. Sleep strategies are fine if you can tolerate jitter in the millisecond range.
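
A rough Java sketch of the "single writer with an MPSC queue in front of it" shape described above. Everything here is an assumption made for illustration: `SingleWriterSketch`, `submit`, `drainOnce` and `run` are hypothetical names, the JDK's `ConcurrentLinkedQueue` stands in for a purpose-built MPSC ring buffer, and the "domain state" is a single `long`. It assumes Java 9 or later for `Thread.onSpinWait()`.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical sketch: many producers enqueue commands, one writer thread applies them. */
final class SingleWriterSketch {
    // Used as an MPSC queue: any thread may offer, only the writer thread polls.
    private final ConcurrentLinkedQueue<Long> inbox = new ConcurrentLinkedQueue<>();
    // Plain field, no locks or atomics: only the writer thread ever touches it.
    private long balance;

    /** Callable from any thread. */
    void submit(long delta) {
        inbox.offer(delta);
    }

    /** Writer thread only: apply everything queued right now as one batch. */
    int drainOnce() {
        int applied = 0;
        Long delta;
        while ((delta = inbox.poll()) != null) {
            balance += delta;
            applied++;
        }
        return applied;
    }

    /** Runs producers on their own threads; the calling thread acts as the single writer. */
    static long run(int producers, int perProducer) {
        SingleWriterSketch state = new SingleWriterSketch();
        Thread[] threads = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            threads[p] = new Thread(() -> {
                for (int i = 0; i < perProducer; i++) state.submit(1);
            });
            threads[p].start();
        }
        long expected = (long) producers * perProducer;
        long applied = 0;
        while (applied < expected) {
            int n = state.drainOnce();
            if (n == 0) Thread.onSpinWait(); // busy-spin while the inbox is empty
            applied += n;
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return state.balance;
    }

    public static void main(String[] args) {
        System.out.println("final balance = " + run(4, 10_000));
    }
}
```

The point of the design is that all mutation happens on one thread, so the state itself needs no synchronization; contention is confined to the queue.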
willtemperley over 1 year ago
Beware: the advertised latency was probably measured with the busy-spin wait strategy, which uses a lot of CPU.

Great library which makes processing concurrent streams incredibly easy.
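
To make the CPU-versus-latency trade-off concrete, here is a small Java sketch of three ways a consumer can wait for a cursor to advance. It does not reproduce the library's own wait-strategy classes; `WaitSketch`, `Strategy` and `waitFor` are hypothetical names, and the 100-microsecond park interval is an arbitrary choice. It assumes Java 9 or later for `Thread.onSpinWait()`.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

/** Hypothetical sketch of busy-spin, yield and sleep waiting on a shared cursor. */
final class WaitSketch {
    enum Strategy { BUSY_SPIN, YIELD, SLEEP }

    /** Blocks until the cursor reaches `wanted`, then returns the value observed. */
    static long waitFor(AtomicLong cursor, long wanted, Strategy strategy) {
        long seen;
        while ((seen = cursor.get()) < wanted) {
            switch (strategy) {
                case BUSY_SPIN:
                    Thread.onSpinWait(); // keeps the core busy, reacts fastest
                    break;
                case YIELD:
                    Thread.yield(); // lets other runnable threads in, still burns CPU
                    break;
                case SLEEP:
                    LockSupport.parkNanos(100_000L); // frees the core, adds wake-up jitter
                    break;
            }
        }
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        for (Strategy strategy : Strategy.values()) {
            AtomicLong cursor = new AtomicLong();
            Thread publisher = new Thread(() -> {
                LockSupport.parkNanos(1_000_000L); // pretend to do ~1 ms of work
                cursor.set(1);
            });
            long start = System.nanoTime();
            publisher.start();
            waitFor(cursor, 1, strategy);
            long elapsedMicros = (System.nanoTime() - start) / 1_000;
            publisher.join();
            System.out.println(strategy + ": woke after " + elapsedMicros + " us");
        }
    }
}
```

Timings from `main` vary by machine and scheduler, so treat them as a rough feel for the jitter rather than a benchmark.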
up2isomorphism over 1 year ago
I never understand the reason for open-sourcing a trading system, if it works.
vivzkestrel over 1 year ago
Stupid question: how do you build a trading system? Anyone got a starter guide or resources?