Lack of backpressure is perhaps the most common newbie mistake in distributed systems. I've just seen <i>so</i> many failures because the servers were overwhelmed and the clients were greedy/impatient (they didn't react well to anything less than an immediate response, even while doing everything they could to overwhelm the servers). It should be a primary design consideration for any such system being built anew.<p>> Backpressure Strategies: control, buffer, drop<p>One of my favorite hacks for Gluster (at Facebook) was in this area. We were using bog-standard NFS as the connection protocol, so neither explicit control nor dropping was an option. (People who have always had that luxury with their own thrift/grpc/whatever protocols and libraries don't realize how much of a pain it is not to have it.) That left buffering, which led to memory pressure and latency bubbles. What I ended up implementing was <i>implicit</i> control: we'd just stop reading on a particular client's socket for increasing numbers of milliseconds if overall latency was too high and its queue was too long. This leveraged information we already had, and "borrowed" TCP's flow/buffer control as our own. Clients couldn't ignore it; they'd block in send() or somewhere equivalent. It worked really well to smooth out many of the worst latency spikes. If you're ever stuck in a similar situation, I highly recommend this approach.
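A rough sketch of that implicit-control trick in Python asyncio terms, not the actual Gluster/NFS code; the thresholds and the shared stats object with its p99_latency() method are made-up placeholders:<p><pre><code>import asyncio

MAX_QUEUE = 100        # assumed per-client queue limit
MAX_LATENCY = 0.050    # assumed overall latency budget, in seconds

class ThrottlingProtocol(asyncio.Protocol):
    def __init__(self, stats):
        self.stats = stats     # hypothetical shared latency metrics
        self.queue_len = 0     # requests queued for this client
                               # (decremented on completion, not shown)
        self.penalty_ms = 0

    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        self.queue_len += 1
        # ... enqueue the request for processing (omitted) ...
        if self.stats.p99_latency() > MAX_LATENCY and self.queue_len > MAX_QUEUE:
            # Stop reading this client's socket for a while. The kernel's
            # receive buffer fills up, TCP's window closes, and the client
            # ends up blocked in send() -- flow control it can't ignore.
            self.penalty_ms = min(max(1, self.penalty_ms * 2), 500)
            self.transport.pause_reading()
            asyncio.get_running_loop().call_later(
                self.penalty_ms / 1000, self.transport.resume_reading)
        else:
            self.penalty_ms = 0
</code></pre>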
Backpressure was a recurring topic at Basho. I thought I might find something from one of our engineers online, but instead I found these: a look at one client's approach for using Riak, including how they implemented their own backpressure in front of it, and Fred Hebert's discussion on queues and backpressure.<p>- <a href="https://blog.booking.com/using-riak-as-events-storage-part4.html" rel="nofollow">https://blog.booking.com/using-riak-as-events-storage-part4....</a><p>- <a href="https://ferd.ca/queues-don-t-fix-overload.html" rel="nofollow">https://ferd.ca/queues-don-t-fix-overload.html</a>
Backpressure is a feature of Elixir's GenStage:<p><a href="https://hexdocs.pm/gen_stage/GenStage.html" rel="nofollow">https://hexdocs.pm/gen_stage/GenStage.html</a><p>and these additional abstractions are built on top of it:<p>- <a href="https://github.com/dashbitco/broadway" rel="nofollow">https://github.com/dashbitco/broadway</a><p>- <a href="https://github.com/dashbitco/flow" rel="nofollow">https://github.com/dashbitco/flow</a>
With pipes between threads, the OS is able to suspend the reading thread when the buffer is empty, or suspend the writing thread when the buffer is full, because the OS "owns" the buffer and the threads. With distributed systems connected via Kafka, for example, there is no "process" overseeing both ends that can suspend or resume holistically, so it's difficult for producers to know about and respond to backpressure.
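For contrast, here's the single-process case in Python, where the runtime owns both ends of a bounded buffer and blocking does all the work for you (the sizes are arbitrary):<p><pre><code>import queue
import threading

buf = queue.Queue(maxsize=10)   # bounded buffer owned by the runtime

def producer():
    for i in range(100):
        buf.put(i)        # blocks when full -> producer is suspended

def consumer():
    while True:
        item = buf.get()  # blocks when empty -> consumer is suspended
        buf.task_done()

threading.Thread(target=producer, daemon=True).start()
threading.Thread(target=consumer, daemon=True).start()
buf.join()                # wait until all 100 items are processed
</code></pre>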
Interesting, for two reasons. In the many years that I developed software, this was known as the producer/consumer problem [1], a name that is (to me) somewhat more expressive than back-pressure. But that's just my opinion!<p>What's different now, as the article points out, is the growth of micro-service architectures, effectively creating a lot of tiny producer/consumer problems across the solution space.<p>Sometimes, I'm glad I don't write code anymore, but only sometimes :-)<p>[1] <a href="https://en.wikipedia.org/wiki/Producer%E2%80%93consumer_problem" rel="nofollow">https://en.wikipedia.org/wiki/Producer%E2%80%93consumer_prob...</a>
I have worked on event-driven systems that read from queues for my entire career, first messaging, then Kafka.<p>Surely in most business scenarios letting data back up on queues for a while suffices?<p>In the situations where it doesn’t, add some more capacity.<p>Backpressure is something I’ve never had to deal with, even in HFT and electronic trading situations.
I don't see eye to eye with the author.<p>AFAIK, the only time a modern software developer needs to worry about backpressure is if they're using asynchronous or otherwise non-flow-controlled I/O. Neither of these cases, nor even a discussion of synchronous vs. asynchronous I/O, was mentioned.
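For what it's worth, this is the distinction asyncio itself draws: a blocking sendall() stalls the caller for free, while async writes buffer in user space and surface backpressure through callbacks you have to handle yourself. A minimal sketch of the asynchronous side (the producer loop is purely illustrative):<p><pre><code>import asyncio

class Producer(asyncio.Protocol):
    # With synchronous, flow-controlled I/O the kernel simply blocks send().
    # With async I/O, writes buffer in user space, and asyncio signals
    # backpressure via these callbacks when the transport's write buffer
    # crosses its high/low water marks -- ignore them and memory grows.
    def connection_made(self, transport):
        self.transport = transport
        self.paused = False
        asyncio.get_running_loop().create_task(self.produce())

    def pause_writing(self):
        self.paused = True    # buffer above high-water mark: stop producing

    def resume_writing(self):
        self.paused = False   # buffer drained below low-water mark

    async def produce(self):
        while True:
            if not self.paused:
                self.transport.write(b"x" * 4096)
            await asyncio.sleep(0)  # yield so the loop can flush the buffer
</code></pre>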
I think the key insight here is that just autoscaling your service to keep queue length short is not always going to work, or not always going to be cost-effective, and upstream sources should be prepared to reduce the amount of work they send (unless there is a business reason to do otherwise). Making the queue longer, or just dropping data that's already been accepted into the queue, will sometimes break people's assumptions about how your system works. You can adjust queue size from time to time if you need to, but if your service sees increased use over time, the amount of variance in queue size can get a bit crazy.<p>I worked on a system a while back which processed petabytes of data... autoscaling was out of the question (we had a fixed amount of hardware to process the data, and the lead time for getting new hardware was not especially short), and buffering the data would require eye-watering amounts of disk space. We just buffered as much as we could and then returned a "try again later" error code. We made sure that data accepted by the system was processed within a short enough window, and that the cost of submitting a work item was so small that you could hammer the front-end as much as you wanted; it would just accept work items or say "try again later".<p>I think one of the lessons here is that you need to think long and hard about what error conditions you want to expose upstream. The farther you propagate an error condition, the more different ways you can solve it, but the more complicated your clients get. For example, disks sometimes fail. You can decide not to propagate disk failure beyond an individual machine, load up RAID 1 (mirror) in all your file servers, and back everything up. Or you can push disk failures farther up the stack and recover from them at a higher level, with lower hardware cost but higher implementation complexity. And if you build a bunch of systems assuming that each part is always working, you run very serious risks of cascading failures once one part does inevitably fail.<p>Obviously, small enough systems running on cloud hardware can usually be autoscaled just fine.
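That "buffer what you can, then reject cheaply" pattern is small enough to sketch; the capacity and the HTTP-style status codes here are placeholder choices, not what our system actually returned:<p><pre><code>import queue

work_queue = queue.Queue(maxsize=10_000)   # buffer as much as we can afford

def submit(item):
    # Cheap admission control: accept if there's room, otherwise tell the
    # caller to try again later instead of buffering without bound.
    try:
        work_queue.put_nowait(item)
        return 202   # accepted; will be processed within the promised window
    except queue.Full:
        return 429   # "try again later" -- the caller owns the retry
</code></pre>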
> Backpressure Strategies: control, buffer, drop<p>Can we add autoscale to that list? I encountered this exact problem today as I was looking for an autoscaler that is simple to set up and maintain. The programs would feed in backpressure metrics to monitor, and the autoscaler would tell me how many pods/consumers there need to be, based on naive estimation or on prediction from yesterday's traffic.<p>K8s seems to be overkill, and it has problems with stateful pods; KEDA looks promising, but I wish there were much simpler options.
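The "naive estimation" part can be as small as this; every parameter here is an assumption you'd tune, not anything KEDA-specific:<p><pre><code>import math

def desired_consumers(queue_length, per_consumer_rate, target_drain_s,
                      lo=1, hi=50):
    # Naive estimate: enough consumers to drain the current backlog within
    # target_drain_s seconds, clamped to a sane range.
    needed = queue_length / (per_consumer_rate * target_drain_s)
    return min(hi, max(lo, math.ceil(needed)))
</code></pre>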
I'm quite befuddled by this article, because the perspective is pretty much exactly 180° from what I'm used to.<p>I don't (try to) control or manage backpressure. Backpressure controls - or rather, steers - me/my code. Getting backpressure right is a massively important and frequently under-considered design aspect.<p>The most significant thing to understand, IMHO, is that you need to look at backpressure in the context of the larger system. More concretely, you need to ask yourself: "is this telling me that another task should run right now?" Both for synchronously driven sources (e.g. a blocking write() syscall) and for asynchronously driven sources (an event loop with a "can send" flag), you need to make sure you (a) don't block other progress (e.g. held DB locks) and (b) actually pull up other tasks (particularly important for partially-asynchronous code).<p>And this leads straight to the single most impactful design/scaling decision for these systems: is the source maintaining its outbound scheduling at a high enough level? This is best clarified with an example: say you have a key/value store that sends out change notifications per key. Does your application need every single step, or just the most recent value? If it's the latter, you shouldn't even have a "true" queue! Instead, the code needs to track which keys are pending for which receiver, and whenever the receiver can accept data, it needs to send the <i>most recent</i> data for the pending keys (see the sketch below). This is the difference between an unbounded queue and one bounded by the number of keys. The former can fail disastrously; the latter will just start lagging, and once it reaches the worst case, it stays there.<p>> For example, if someone says: “I made this new library with built-in backpressure…” they probably mean it somehow helps you manage backpressure, not that it actually intrinsically is the cause of it. Backpressure is not “desirable” except when it’s inescapable and you want to protect something else from receiving it.<p>… and this really encapsulates my disagreement. Backpressure is absolutely desirable. I want my libraries to provide me with backpressure when there is reason to do so, and to do it in a way I can actually "consume"!<p>And lastly:<p>> Backpressure Strategies: control, buffer, drop<p>"buffer" and "drop" are in most cases just bugs, and very hard ones to debug and track down, due to their load dependency.
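A minimal sketch of that "most recent value per key" pattern; the names and the drain budget are mine, not from any particular library:<p><pre><code>class CoalescingNotifier:
    def __init__(self):
        self.latest = {}       # key -> most recent value
        self.pending = set()   # keys with an unsent update

    def on_change(self, key, value):
        # An update to an already-pending key just overwrites the value;
        # the "queue" is bounded by the number of keys, not the update rate.
        self.latest[key] = value
        self.pending.add(key)

    def on_can_send(self, send, budget=64):
        # Called when the receiver can accept data: drain up to `budget`
        # keys, always sending only the most recent value for each.
        while self.pending and budget > 0:
            key = self.pending.pop()
            send(key, self.latest[key])
            budget -= 1
</code></pre>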