> Our Solution: Use Redis and Postgres Database Polling + leader election using RedLock<p>To me this seems like one of the worst architectural choices they could have made. I'm assuming they had constraints not mentioned in the article that justify this decision, but for a simple distributed circuit breaker it seems far from a reasonable solution.<p>Besides that, there is an implicit assumption here that I'm not sure holds. By having a 'leader' you are basically saying that your network connections never fail, or that they always fail at the same time on every pod. Under normal conditions that assumption doesn't seem too bad, but as soon as things start to degrade you may have a pretty bad outage on your hands.<p>If I had to implement this feature I would probably use Raft [0] to communicate between pods. I would also keep track of two values for each host: the local failure rate and the global failure rate. That way each pod can change its behavior according to current network conditions.<p>[0] - <a href="https://github.com/hashicorp/raft">https://github.com/hashicorp/raft</a>
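<p>To make the "two values per host" idea concrete, here is a rough sketch in Go. It is only an illustration under my own assumptions, not anything from the article: the names (Breaker, Record, SetGlobalRate, Decide), the single shared threshold, and the three-way decision are all hypothetical. The replication layer is left out entirely; the sketch just assumes that something (Raft or otherwise) eventually calls SetGlobalRate with a cluster-wide figure.

```go
package main

import "sync"

// Decision is what a pod does with traffic to a given host.
// The three-way split is an assumption made for this sketch.
type Decision int

const (
	Allow      Decision = iota // host looks healthy locally and globally
	LocalOnly                  // only this pod sees failures: likely a local network problem
	GlobalOpen                 // the cluster-wide rate is over the threshold
)

// hostStats holds the two values tracked per host.
type hostStats struct {
	ok, fail   int     // local counters (a real breaker would use a sliding window)
	globalRate float64 // last cluster-wide failure rate received, 0 if none yet
}

// Breaker is a hypothetical per-pod circuit breaker.
type Breaker struct {
	mu         sync.Mutex
	threshold  float64 // failure rate at or above which a host counts as unhealthy
	minSamples int     // local calls required before the local rate is trusted
	hosts      map[string]*hostStats
}

func NewBreaker(threshold float64, minSamples int) *Breaker {
	return &Breaker{
		threshold:  threshold,
		minSamples: minSamples,
		hosts:      make(map[string]*hostStats),
	}
}

// stats returns the entry for host, creating it if needed.
// Callers must hold b.mu.
func (b *Breaker) stats(host string) *hostStats {
	s, found := b.hosts[host]
	if !found {
		s = &hostStats{}
		b.hosts[host] = s
	}
	return s
}

// Record notes the outcome of one local call to host.
func (b *Breaker) Record(host string, ok bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	s := b.stats(host)
	if ok {
		s.ok++
	} else {
		s.fail++
	}
}

// SetGlobalRate stores the cluster-wide failure rate for host.
// How that number is agreed on between pods is out of scope here.
func (b *Breaker) SetGlobalRate(host string, rate float64) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.stats(host).globalRate = rate
}

// Decide combines both rates: the global rate wins, and a bad local
// rate alone only changes this pod's behavior.
func (b *Breaker) Decide(host string) Decision {
	b.mu.Lock()
	defer b.mu.Unlock()
	s := b.stats(host)
	if s.globalRate >= b.threshold {
		return GlobalOpen
	}
	total := s.ok + s.fail
	if total >= b.minSamples && float64(s.fail)/float64(total) >= b.threshold {
		return LocalOnly
	}
	return Allow
}
```

<p>With NewBreaker(0.5, 4), four failed calls recorded against a host make Decide return LocalOnly on that pod, and it only becomes GlobalOpen once a global rate at or above 0.5 is set. The point is that a pod with a bad link backs off by itself without tripping the breaker for everyone, and a pod with a healthy link still benefits from what the others have seen.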