This is up for debate but this [0], IMHO, is pretty much the gold standard of explaining distributed algorithms.

[0] http://thesecretlivesofdata.com/raft/
Software engineers love Paxos because it takes something very complex (a distributed system) and makes it equivalent to working with a single machine: you only ever talk to the leader. It gives you redundancy at the expense of performance.

Paxos is used to achieve something called Strong Consistency, where each node sees the same messages in the same order. If you think of each node as a deterministic state machine, they are guaranteed to end up in the same state after responding to the same sequence of messages. It's nice and intuitive, but requiring global synchronization on every write is terrible for performance.

Other consistency schemes exist. A popular one is Eventual Consistency, where writes are made immediately at any node (not just the leader) and the system is expected to synchronize in the background and "converge" to the same state. However, this can result in merge conflicts: if you're editing a document in collaboration with other users, what if you edit a word in a paragraph while another user deletes that entire paragraph? Does the system resolve this automatically, or require user assistance? The answer to this question varies according to system requirements. I think most HN users have experienced the joys of resolving merge conflicts.

A newer model is something called Strong Eventual Consistency, which is similar to Eventual Consistency but merge conflicts are impossible by design: every update to the system must be commutative, associative, and idempotent with respect to other updates. It is not always possible to design your system this way. These systems are implemented with Conflict-Free Replicated Data Types (or ad-hoc equivalents) and have excellent liveness/throughput/performance characteristics compared to Strong Consistency.

CRDTs are not as simple as Paxos. You're forced out of the cozy one-system world and your system must deal with two nodes concurrently holding different values. For most applications, magic Paxos dust is all you need. For others, CRDTs are an excellent tool.
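To make the "conflicts are impossible by design" point concrete, here's a minimal sketch of a grow-only counter CRDT in Python. The names and structure are illustrative only, not taken from any particular library:

    # Minimal sketch of a grow-only counter (G-Counter) CRDT.
    # Each replica only increments its own slot; merge takes the
    # element-wise max, which is commutative, associative, and
    # idempotent, so replicas converge regardless of message order.

    class GCounter:
        def __init__(self, node_id, num_nodes):
            self.node_id = node_id
            self.counts = [0] * num_nodes

        def increment(self):
            self.counts[self.node_id] += 1

        def value(self):
            return sum(self.counts)

        def merge(self, other):
            self.counts = [max(a, b) for a, b in zip(self.counts, other.counts)]

    # Two replicas accept writes independently and still converge:
    a, b = GCounter(0, 2), GCounter(1, 2)
    a.increment(); a.increment()
    b.increment()
    a.merge(b); b.merge(a)
    assert a.value() == b.value() == 3

Note there's no coordination on the write path at all; the price is that only operations with a merge like this (max, set union, etc.) fit the model.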
This well-illustrated post is technically about the core Synod agreement protocol in Paxos. Building a consistent distributed service on top requires additional scaffolding and infrastructure. Typically, people layer a "replicated state machine (RSM)" on top, which maintains the illusion of a single consistent object even though it is composed of distributed replicas. An illustrative sketch of that layer follows below.

Also keep in mind that Raft, Zab, and Viewstamped Replication (in reverse chronological order) are alternatives to the Synod protocol in Paxos. These protocols differ from Paxos by employing a different leader-election mechanism and a slightly different way of maintaining their invariants.

There have been many Paxos variants. This site [1] shows the various Paxos variants on a timeline and points out their contributions.

Those of you interested in building replicated state machines using Paxos should take a look at OpenReplica [2]. It is a full Multi-Paxos implementation that takes any Python object and makes it distributed and fault-tolerant, like an RPC package on steroids.

[1] http://paxos.systems/

[2] http://openreplica.org/faq/
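For a sense of what the RSM layer does once consensus has ordered the command log, here is an illustrative Python sketch (this is not OpenReplica's actual API; the class and log are made up for the example):

    # Once consensus has agreed on the order of the command log,
    # every replica applies the same commands in the same order to
    # a deterministic object, so all replicas expose the same state.

    class KVStore:
        def __init__(self):
            self.data = {}

        def apply(self, command):
            op, key, *rest = command
            if op == "put":
                self.data[key] = rest[0]
            elif op == "delete":
                self.data.pop(key, None)
            return self.data.get(key)

    # The agreed-upon log is what Multi-Paxos / Raft / Zab produce.
    log = [("put", "x", 1), ("put", "y", 2), ("delete", "x")]

    replicas = [KVStore() for _ in range(3)]
    for replica in replicas:
        for entry in log:          # same commands, same order
            replica.apply(entry)

    assert all(r.data == {"y": 2} for r in replicas)

The hard part (agreeing on the log despite failures) is the consensus protocol; the RSM part is just deterministic replay.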
Also interesting: [1]

> Raft is a consensus algorithm that is designed to be easy to understand. It's equivalent to Paxos in fault-tolerance and performance. The difference is that it's decomposed into relatively independent subproblems, and it cleanly addresses all major pieces needed for practical systems. We hope Raft will make consensus available to a wider audience, and that this wider audience will be able to develop a variety of higher quality consensus-based systems than are available today.

[1] https://raftconsensus.github.io/
> Honest-to-goodness real-life implementations of Paxos can be found at the heart of … Google’s magnificent Spanner database…<p>I'm not sure about Spanner and Paxos. Sebastian Kanthak said during his Google Spanner talk:<p>"If you've been to the Raft talk this morning, our Paxos implementation is actually closer to the Raft algorithm than to what you'd read in the Paxos paper, which is… if you haven't read it, don't read it, it's horrible." (at 7:43)<p><a href="http://www.infoq.com/presentations/spanner-distributed-google" rel="nofollow">http://www.infoq.com/presentations/spanner-distributed-googl...</a>
Nice animations.<p>How do you keep a broken or hostile node from advancing the sequence number to the end of the sequence number space?<p>There's an algorithm from one of Butler Lamson's grad students at MIT which fixes this, but it seems to require one more message per cycle. (<a href="http://pmg.csail.mit.edu/~castro/thesis.pdf" rel="nofollow">http://pmg.csail.mit.edu/~castro/thesis.pdf</a>) That paper later appears as a Microsoft Research paper on how to make an NFS-like file system with this consensus properly. Did Microsoft ever put that in a product?
Having looked at it for a few minutes, it really reminds me of the routing protocols used for distributing routes in networks (also Layer 2 stuff, IIRC). There you also find heartbeats, elections, etc.

Is there a connection?

Also, does it have anything to do with the Byzantine Generals Problem?
> Side note: it’s important that no two proposers ever use the same sequence number, and that they are sortable, so that they truly reference only one proposal, and precedence between proposals can be decided using a simple comparison.<p>They are moving the core problem into a different domain. Worst explanation of PAXOS ever... nice animations though.<p>Edit: 'Worst explanation' is just an exaggeration, obv. It is nice, but doesn't explain really important issues.