The method is called "Power of Two <i>Random Choices</i>" (<a href="http://www.eecs.harvard.edu/~michaelm/postscripts/handbook2001.pdf" rel="nofollow">http://www.eecs.harvard.edu/~michaelm/postscripts/handbook20...</a>). And the two-choices paradigm is widely applicable beyond load balancing. In particular, it applies to hash table design (e.g. cuckoo hashing) and cache eviction schemes (<a href="https://danluu.com/2choices-eviction/" rel="nofollow">https://danluu.com/2choices-eviction/</a>).
I'm not an expert in this field, but an engineer of vimeo went into detail, why this approach did not work for them. [1]<p>Problem with consistent hashing:<p><pre><code> However, consistent hashing comes with its own problem: uneven distribution of requests.
Because of its mathematical properties, consistent hashing only balances loads about as
well as choosing a random server for each request, when the distribution of requests is
equal. But if some content is much more popular than others (as usual for the internet),
it can be worse than that.
</code></pre>
Problem with Power of 2 Load Balancing:<p><pre><code> Why wasn’t there a way to say “use consistent hashing, but please don’t overload any
servers”? As early as August 2015, I had tried to come up with an algorithm based on
the power of two random choices that would do just that, but a bit of simulation said
that it didn’t work. Too many requests were sent to non-ideal servers to be worthwhile.
</code></pre>
Instead, he used something called <i>Consistent Hashing with Bounded Loads</i>.<p>[1] <a href="https://medium.com/vimeo-engineering-blog/improving-load-balancing-with-a-new-consistent-hashing-algorithm-9f1bd75709ed" rel="nofollow">https://medium.com/vimeo-engineering-blog/improving-load-bal...</a>
The simplest load balancing I've done is modulo the user ID by the number of servers then point at that server.<p>This solves caching too since you are only ever receiving and caching user data on a single server. No cache communication required. You can enforce it on the server side for security as well.<p>Doesn't require a load balance server - just an extra line of code.<p>Keep it simple.
Regarding the math section, could someone please describe it like you were talking to a 5 year old?<p>1)
Θ( log n = log / log n )<p>2)
Θ(log log n)
"Power of 2 Random Choices" ... has nothing to do with the "Power of 2" directly.<p>I like 2Choice because it is not dependent on hash function design & is temporal, but I have a positive aversion to the 2^n hash distributions when it comes to data, specifically for distributed systems which need to flex up/down [1].<p>[1] - <a href="http://notmysock.org/blog/hacks/1440" rel="nofollow">http://notmysock.org/blog/hacks/1440</a>
I've seen a paper doing the same thing directly at the network layer using IPv6 extension headers: <a href="http://www.thomasclausen.net/wp-content/uploads/2017/06/2017-ICDCS-SRLB-The-Power-of-Choices-in-Load-Balancing-with-Segment-Routing.pdf" rel="nofollow">http://www.thomasclausen.net/wp-content/uploads/2017/06/2017...</a>