Load Balancing is Impossible [video]

193 pointsby thepumpkin1979over 8 years ago

16 comments

jonduboisover 8 years ago

It seems that the speaker is discussing small servers; 150 concurrent requests per server/process sounds very small to me (but maybe this is common in enterprise?); in this case, I can see why the variance introduced by the random algorithm is such a problem.But if you have an efficient system where each server can handle 10K+ concurrent requests, then the variance introduced by the random algorithm becomes insignificant. Also, it doesn't matter that some requests take much longer than others; if you're dealing with a lot of users per server, the slow requests will be distributed evenly too and average out.I implemented the balls-and-bins algorithm and I fond that when dealing with few buckets and a large number of balls, the variance in load between buckets became smaller. I even added randomness to the 'weight' of each request (to account for slow vs fast requests) and the distribution was still even - The more requests each server can handle, the more even the distribution is with the random algorithm.

评论 #13137983 未加载

devyover 8 years ago

One of the approaches and the last Tyler, the speaker, introduced in the talk to improve the fat tail distribution of load balancing latency problem was Join-Idle-Queue algorithm[1] that came out of UIUC Extreme Computing Group, Microsoft Research and Azure in 2011.And according to Tyler's data (fast.ly CDN data since he's the CTO there), it beats all the current load balancing algorithms including Join-Shortest-Queue. Six years later, I wonder:a) was there any other novel / better algorithms tops JIQ in LB performance?b) has Microsoft Azure LB uses JIQ internally in their LB offerings?c) has any open source LB software implement JIQ algorithm on the horizon? (according to Tyler, he found none.)Can someone with expertise share some lights on these? Thanks.[1]: <a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2011/10/idleq.pdf" rel="nofollow">https://www.microsoft.com/en-us/research/wp-content/uploads/...</a>

评论 #13140696 未加载

评论 #13140090 未加载

评论 #13140787 未加载

acqqover 8 years ago

The slides are here:<a href="http://www.slideshare.net/Fastly/load-balancing-is-impossible-62833005" rel="nofollow">http://www.slideshare.net/Fastly/load-balancing-is-impossibl...</a>Wrap up and Q&A:<a href="https://www.youtube.com/watch?v=gas2v1emubU&t=30m24s" rel="nofollow">https://www.youtube.com/watch?v=gas2v1emubU&t=30m24s</a>The article from the slide 23:<a href="http://latencytipoftheday.blogspot.co.at/2014/06/latencytipoftheday-most-page-loads.html" rel="nofollow">http://latencytipoftheday.blogspot.co.at/2014/06/latencytipo...</a>"#LatencyTipOfTheDay: MOST page loads will experience the 99%'lie server response"

评论 #13137957 未加载

评论 #13141634 未加载

评论 #13137958 未加载

thomasahleover 8 years ago

Very nice. I was expecting this to ignore all the research on things like power of two choices and similar provable techniques, but the talk is actually really thorough in its research.It's very interesting to see what experiences people have with using all of this in practice. The title seems a bit clickbaity though.

spydumover 8 years ago

Good talk, I feel like I've tread exactly this same thinking and given a similar talk internally (i accidentally ended up as the LB SME), even used poisson distribution to model the requests and do some code demos.He references a simple algorithm/paper on simply chosing between two servers, and the results on the smoothing out of variance was impressive (paper: <a href="https://www.eecs.harvard.edu/~michaelm/postscripts/mythesis.pdf" rel="nofollow">https://www.eecs.harvard.edu/~michaelm/postscripts/mythesis....</a> ).So, no real conclusion, it's just interesting to peel back the layers of LB.. so many people treat it as magic, when really its a statistical lie - and impossible to achieve perfect distribution (hence the title)!

toast0over 8 years ago

Round robin is somewhat better than random here, because you get more uniform distribution of requests to servers; but you do have to keep a counter and if you have multiple load balancers, with few requests, there's a chance that they'll get synchronized and you'll get bursts of traffic to one server, then the next, etc.You still end up with uneven load if the requests are not equal in load generated (essentially always), but trying to be smart (choosing the least loaded/queued server) doesn't really work (too much work on load balancer, too much latency between request picking and load measuring, etc).

Pica_soOover 8 years ago

Imagine a beach. A beach is made of particles of different sizes, aka length. Now physics apply to this beach, as a line of "processors" that grab one particle out of the array and release it to the future beach, taking the time of length of the particle. There can be emptiness between particles.First property of this, is "entropy" that the beach is self-sorting interaction by length, as time taken to process if close to a different sized particle.Second property is the "light-speed" of this physical simulation. This is the maximum particle length multiplied with the number of processors.Third property is causality, causality is the range of the furthest out particle reached by the light-speed length again multiplied by the light-speed. Obviously causality can be reduced by reducing light-speed.Fourth property is non-interaction, if you add traversable, unmoving particles into the scheduler queues, this warps "time" as in, large particles are forced to jump over them, while small particles take free particle-size time to traverse, and as being non-interacting, are invisible to the huge particles.Sixth property is determinism. The whole physic-simulation is precise and from a deterministic input, we can derive a deterministic output, as long as causality remains unviolated.Now for scheduling, the data-physics are rather simplistic, as the processor-front, or "time" is not repeatedly interacting with the same beach.We also can determinate ahead of time, which processor interacts with what particle, assuming light-speed is not altered. Also note, that big particle being processed equal reduced light-speed.Now what is optimal scheduling? Optimized for throughput, as in shoving huge particles to dedicated processors, with huge fields of non-interacting particles? Fair scheduling, as in having all particles having average particle-size time going through? Prevent a causality buildup over lights-peed?PS: I once wrote a little visualization of these data-physics, and showed them around algorithm class- nobody ever beside the prof cared for it. Nice to meet people who are fascinated by mind games.PS^2: You can use similar property's to compress data, via Conway's game of life into deterministic lock-step simulations. Never saw that used in the wild, as you need either pre-computated simulations to be fast, or some sort of relational operator on some simple pre-set of simple simulations.PS^3: Sorry, got carried away. This is fascinating. Should upload the simulation to GitHub. Excuse any typos.

评论 #13151191 未加载

sisciaover 8 years ago

Great talk, however it is not explained the relationship between the Poisson process and the log-normal distribution.It actually skip that explanation.

RangerScienceover 8 years ago

Huh. It occurs to me that your brain needs to deal with a poisson process for incoming information. How does it handle it?

评论 #13138103 未加载

sjtrnyover 8 years ago

Why not just use a PID controller to load balance?

评论 #13138052 未加载

jest7325over 8 years ago

Interesting talk and I am wondering if the op tried load-balanced algorithm like "leastresponsetime" and "leastconnection"? Today LB are pretty advanced and sometimes are even not even used in technology like ecmp.

bogomipzover 8 years ago

In my experience given talks given by Fastly are always worth seeing or attending.If have the opportunity to attend a talk given by the founder Artur Bergman don't miss it.It will be as informative as it is funny and entertaining. There are many on youtube from conferences past.

graycatover 8 years ago

For the case of load balancing as in the OP, for a statement of the problem, over time, requests for work arrive at an Internet computer server farm. The requests differ in what computing resources at the server farm are needed and for each resource how much work. E.g., a request may need 10 GB of main memory and 40 GB of virtual memory paged memory, 3 TB of I/O to rotating disk, and four processor cores, all for 50 minutes.So, at each arrival, have to assign the work to a server. When the assignment is made, we do know about about the, say, state of each of the servers, say, what work has been assigned and how busy the server is.The goal of the work is in some sense to get the best performance from the server farm for the loads. There are various candidate definitions/measures of performance, e.g., minimize the average elapsed time for the work of the requests to be done, that is, minimize the average response time as seen by the users/customers.So, the work of the load balancing is just the assignments, one work request at a time, to the servers. This assignment is under uncertainty maybe about the details of each request and, so that can plan ahead, also about what the next requests will be.E.g., if we had some idea that soon we would get a request that would need all of one server, then maybe we should think ahead and have reserved an empty server for that request. If we don't reserve such a server, then to assign this request we might have to keep waiting until have an empty server and, unless make an effort to have such an empty server, the request could go unassigned for a long time thus hurting the goal.So, we have to make decisions over time under uncertainty to achieve a goal that is an average of the results.Now likely we have a problem in the field of applied math called stochastic optimal control. In our case the control is the freedom we have when we make the assignments. Stochastic means varying with uncertainty over time. The stochastic part is the uncertainty of the next requests that arrive. The optimal part is that we want to do best as possible likely on some average measure, only average because we are working with uncertainty. Sure, if we don't like average, we could use median, etc.IIRC there was some work on this problem by some especially diligent researchers at the IBM Watson lab and based in part on work of H. Kushner at the Division of Applied Mathematics at Brown University.IIRC the good news is that those researchers made good progress. IIRC the bad news was that the computations for their solution were a bit too large to be practical! Also, from the references I saw them using, the math prerequisites for their work were a bit much!So, maybe good work for a solution is in the literature but too difficult to use. Maybe.My suggestion: Be practical. First cut, assign the next request for work to the least busy server. If the server farm is so busy that often all the servers are too busy for more work, then have to leave requests in a FIFO queue waiting until some server is not too busy for the first request in the queue. Collect empirical, production data on this control, the FIFO queue, the response times, any problems etc. If see no significant problems, then declare the problem solved for now. Else focus on where the problems are and look for a control that also handles those problems.This material about stochastic optimal control was heavily from R. Bellman. Other authors include W. Fleming, D. Bertsekas, S. Shreve, E. Dynkin, R. Rockafellar, R. Wets. The field has been of interest by parts of applied math (e.g., Brown), operations research, and systems and electronic engineering.This load balancing at a server farm contains as a special case some of the topics in queuing theory.Computer science has long been interested in the problem of job scheduling on batch computer systems which, of course, is a relatively small special case of load balancing as in the OP. IIRC the usual result of the job scheduling work was that the problem was in NP-complete and otherwise in practice more work to do than the work being scheduled! Maybe now computer science will also be interested in something stochastic optimal control for load balancing.

abcdevover 8 years ago

Great presentation!

RunawayGalaxyover 8 years ago

Pretty sure that's a donkey.

barrystaesover 8 years ago

SFW cause no stupid meme slides. So.. whats the conclusion of the video i did not hear?