TechEcho
The Raft Consensus Algorithm (2015)

343 points by oumua_don17 over 1 year ago

19 comments

lpage over 1 year ago
Maelstrom [1], a workbench for learning distributed systems from the creator of Jepsen, includes a simple (model-checked) implementation of Raft and an excellent tutorial on implementing it.

Raft is a simple algorithm, but as others have noted, the original paper includes many correctness details often brushed over in toy implementations. Furthermore, the fallibility of real-world hardware (handling memory/disk corruption and grey failures), the requirements of real-world systems with tight latency SLAs, and the need for things like flexible quorums and dynamic cluster membership make implementing it for production a long and daunting task. The commit histories of etcd and hashicorp/raft, likely the two most battle-tested open source implementations of Raft, still surface correctness bugs on the regular, which tells you all you need to know.

The TigerBeetle team talks in detail [2] about the real-world aspects of distributed systems on imperfect hardware and non-abstracted system models, and why they chose Viewstamped Replication, which predates Paxos but looks more like Raft.

[1]: https://github.com/jepsen-io/maelstrom/
[2]: https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/DESIGN.md
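One example of a correctness detail that toy implementations tend to skip is the leader's commit rule from the Raft paper (Figure 2 and Section 5.4.2): a leader only counts replicas for entries from its own current term, and older entries become committed indirectly. A minimal Python sketch of that rule, with hypothetical function and parameter names and a plain list of terms standing in for a real log:

```python
def advance_commit_index(commit_index, current_term, log_terms, match_index):
    """Return the leader's new commit index (1-based; 0 means nothing committed).

    log_terms[i] is the term of the log entry at index i + 1.
    match_index holds, for every server in the cluster (leader included),
    the highest log index known to be stored on that server.
    All names here are illustrative, not taken from any real library.
    """
    majority = len(match_index) // 2 + 1
    # Scan from the newest entry down to the first uncommitted one.
    for n in range(len(log_terms), commit_index, -1):
        replicated = sum(1 for m in match_index if m >= n)
        # The term check is the easy-to-miss part: an entry from an older
        # term is never committed just by counting replicas.
        if replicated >= majority and log_terms[n - 1] == current_term:
            return n
    return commit_index


# Leader in term 4 whose log only holds entries from terms 1 and 2:
# index 2 sits on 3 of 5 servers, yet nothing is committed.
stale = advance_commit_index(0, 4, [1, 2], [2, 2, 2, 1, 1])

# Once a term-4 entry reaches a majority, it and everything before it commit.
fresh = advance_commit_index(0, 4, [1, 2, 4], [3, 3, 3, 1, 1])
```

Dropping the term comparison still passes most happy-path tests, which is part of why this kind of bug survives in simplified implementations.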
eatonphil over 1 year ago
I had a fun time recently implementing Raft leader election and log replication (i.e. I didn't get to snapshotting/checkpointing). It was one of the most challenging projects I've tried to do.

The Raft paper is very gentle to read, and gives you a great intuition on its own. Even if you don't want to implement it, you probably use software that uses it, like etcd, Consul, CockroachDB, or TiDB.

I collected all the resources I found useful while doing it, including Diego Ongaro's thesis and his TLA+ spec, here: https://github.com/eatonphil/goraft#references

Some people say Figure 2 of the Raft paper has everything you need, but I'm pretty sure that's just not true. To me, anyway, it's a little more vague than the TLA+ spec.
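For a sense of what Figure 2 does spell out, here is a rough sketch of the receiver side of the RequestVote RPC as the paper describes it. The `Node` class and the function name are made up for illustration, persistence and networking are left out, and none of this is taken from goraft or any other implementation:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical in-memory stand-in for a Raft server's persistent state."""
    current_term: int = 0
    voted_for: Optional[str] = None
    log_terms: List[int] = field(default_factory=list)  # term of each entry


def handle_request_vote(node: Node, term: int, candidate_id: str,
                        last_log_index: int, last_log_term: int) -> Tuple[int, bool]:
    """Return (current_term, vote_granted) for an incoming RequestVote."""
    # Reject candidates from a stale term.
    if term < node.current_term:
        return node.current_term, False
    # A newer term resets the vote (a real node would also become a follower).
    if term > node.current_term:
        node.current_term = term
        node.voted_for = None
    # "At least as up-to-date": compare last terms first, then log lengths.
    my_last_index = len(node.log_terms)
    my_last_term = node.log_terms[-1] if node.log_terms else 0
    up_to_date = (last_log_term, last_log_index) >= (my_last_term, my_last_index)
    # At most one vote per term, but repeating a vote for the same candidate is fine.
    if node.voted_for in (None, candidate_id) and up_to_date:
        node.voted_for = candidate_id
        return node.current_term, True
    return node.current_term, False
```

What the sketch leaves out (timers, retries, writing `current_term` and `voted_for` to disk before replying) is roughly where Figure 2 stops being enough on its own.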
galenmarchetti over 1 year ago
If you're interested in consensus algorithms, you might be interested in a book I used in a theoretical course on distributed systems, "Reasoning about Knowledge" (https://mitpress.mit.edu/9780262562003/reasoning-about-knowledge/).

You have to invest a bit in learning about modal logic, but once you get past that part, the book provides proofs of why things like Raft or Paxos work that are super intuitive and straightforward. It basically pushes the complexity of proving these algorithms into the logic structure used to form the proof (in an intuitive way). Highly recommend; it changed how I think about consensus!
henrik_w over 1 year ago
Can't resist posting this (from Classic Programmer Paintings):

“Raft Consensus Algorithm Failure”, Théodore Géricault, 1819

https://classicprogrammerpaintings.com/post/614108749635928064/raft-consensus-algorithm-failure-th%C3%A9odore
lucb1e over 1 year ago
If anyone else doesn't understand what the visualisation is supposed to show, note that you can click on one of the nodes and make it fail. Particularly try this with the current "leader" (the node that's sending and receiving all the packets). Press the little pause icon next to the first slider to turn it back into a clock and resume the simulation.

Has someone else figured out what the spreadsheet on the right is? It looks broken to me, as it always remains empty (but then I thought the rest of the simulation was broken too, before understanding that it only shows the happy flow by default). The clickable elements I've discovered so far are the two sliders, the clock/pause icon, and the individual servers.
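What the visualisation animates when the leader is stopped is, roughly, the followers' election timers racing each other: the first follower whose timer runs out without hearing a heartbeat becomes a candidate. A small sketch of that idea, assuming the 150-300 ms range the Raft paper gives as an example; the function names are hypothetical, and a real node would also reset its timer on every heartbeat:

```python
import random


def pick_election_timeouts(node_ids, rng, low_ms=150, high_ms=300):
    """Give every follower its own randomized election timeout in milliseconds."""
    return {node_id: rng.uniform(low_ms, high_ms) for node_id in node_ids}


def first_candidate(timeouts):
    """The follower whose timer expires first starts the election."""
    return min(timeouts, key=timeouts.get)


rng = random.Random()  # pass random.Random(seed) for a repeatable run
timeouts = pick_election_timeouts(["S1", "S2", "S3", "S4"], rng)
print(first_candidate(timeouts), "times out first and requests votes")
```

Randomizing the timeouts is what keeps the followers from all becoming candidates at the same moment and splitting the vote over and over.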
benreesman over 1 year ago
The last time I was working in a setting where a rock-solid Chubby-alike under serious load was constantly top of mind was some years ago, and at that time you used ZK if failure wasn’t an option.

But AFAIK people have been putting heavy, heavy work on Raft-based options like etcd and Consul and others for many years now.

Is one of those systems like the new best default? Certainly the conceptual clarity and elegance of Raft seem like things that would show up in performance and reliability, but I’m just dated on this.

What are people (who aren’t at Google or married to GCP) using as the best-practices default when the stakes are high in 2023?

I think there’s a production-grade Rust implementation of Raft from, IIRC, TiKV, and a rock-solid, high-performance lock server seems squarely in the sweet spot for Rust. Are people using that?
bjornasm over 1 year ago
Here is their answer to their own question, "What is Raft?":

> Raft is a consensus algorithm that is designed to be easy to understand. It's equivalent to Paxos in fault-tolerance and performance. The difference is that it's decomposed into relatively independent subproblems, and it cleanly addresses all major pieces needed for practical systems. We hope Raft will make consensus available to a wider audience, and that this wider audience will be able to develop a variety of higher quality consensus-based systems than are available today.

After reading that I still have no idea. They are not alone in doing this, but I think it's a shame that people don't spend the extra time and effort to properly describe their work.
Vervious over 1 year ago
Consensus protocol researcher here. For what it’s worth, I think that the plethora of blockchain research in the last 10 years has made consensus much easier to understand. Raft (in particular, with all of its subtleties) reads (and implements) like Greek in comparison.

For a beginner to consensus protocols today, I would start them with Bitcoin, then move on to Paxos/Tendermint/Simplex, and skip Raft entirely. (Simplex is my paper, a simplified version of PBFT.)
kevdev over 1 year ago
I love this site. When I was learning & implementing Raft in my distributed systems course, this page was invaluable. Plus the paper itself is pretty easy to read.
jmholla over 1 year ago
Are there consensus algorithms that don't require changes to go through a leader? In many distributed systems, you want to distribute intake as well.
henrik_w over 1 year ago
When I was studying the Raft algorithm a year and a half ago, I found this video on it by John Ousterhout to be a good complement:

Designing for Understandability: The Raft Consensus Algorithm

https://www.youtube.com/watch?v=vYp4LYbnnW8
63 over 1 year ago
> Raft is a consensus algorithm that is designed to be easy to understand.

> Consensus typically arises in the context of replicated state machines, a general approach to building fault-tolerant systems.

I recognize that I'm not the intended audience, but I do think I would be a lot more capable of understanding this article if it used less jargon or at least defined its terms. I'm only mentioning this because ease of understanding is an explicit goal.

Can someone give a real-world example of where this would be used in a production app? I'm sure it's very practical, but I'm getting caught up in trying to understand what it's saying.
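As a rough illustration of the "replicated state machine" idea quoted above: if every server applies the same commands in the same order to a deterministic state machine, every server ends up in the same state. Consensus (Raft, in this case) is the part that gets the servers to agree on that order; the sketch below skips it entirely and simply hands each replica an identical log. The class and function names are illustrative only.

```python
class KVStateMachine:
    """A deterministic key-value store driven purely by a command log."""

    def __init__(self):
        self.data = {}

    def apply(self, command):
        op, key, *args = command
        if op == "set":
            self.data[key] = args[0]
        elif op == "delete":
            self.data.pop(key, None)
        else:
            raise ValueError(f"unknown command: {op!r}")


def replay(log, replica_count):
    """Apply the same log, in order, to several independent replicas."""
    replicas = [KVStateMachine() for _ in range(replica_count)]
    for replica in replicas:
        for command in log:
            replica.apply(command)
    return replicas


# The agreed-upon log; in a real system Raft decides its contents and order.
log = [("set", "x", 1), ("set", "y", 2), ("delete", "x"), ("set", "y", 3)]
replicas = replay(log, replica_count=3)
print([r.data for r in replicas])
```

etcd, mentioned elsewhere in this thread, is a production example of the pattern: a key-value store whose replicas stay identical because they apply a log replicated with Raft.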
maxpert over 1 year ago
I've written a whole SQLite replication system that works on top of Raft (https://github.com/maxpert/marmot). The best part is that Raft has a well-understood and strong library ecosystem as well. I started off with libraries, and when I noticed I was reimplementing distributed streams, I just took an off-the-shelf implementation (https://docs.nats.io/nats-concepts/jetstream) and embedded it in the system. I love the simplicity and reasoning that comes with Raft. However, I am playing with EPaxos these days (https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf), because then I can truly decentralize it into a masterless implementation. Right now I've added a sharding mechanism on various streams so that in high-load cases masters can be distributed across nodes too.
htowerad3242 over 1 year ago
It's a shame many undergraduate CS curricula are allergic to distributed systems and type systems.

Even the grad program I was looking at is hot dog water.

I've been playing with Raft and Paxos. Employers will not care as these were learned out-of-band from degree mills.
EGreg over 1 year ago
Is Raft Byzantine-fault-tolerant though? Can it be made so?

Paxos can: http://en.wikipedia.org/wiki/Paxos_(computer_science)#Byzantine_Paxos
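For context on the question: Raft as published assumes crash (non-Byzantine) failures, where a simple majority quorum is enough, so a cluster of n = 2f + 1 servers tolerates f crashed ones. Byzantine protocols such as PBFT need n >= 3f + 1 servers to tolerate f faulty ones. A tiny sketch of that arithmetic (the function names are hypothetical):

```python
def crash_faults_tolerated(n):
    """Crash-fault model (Raft, classic Paxos): needs n >= 2f + 1."""
    return (n - 1) // 2


def byzantine_faults_tolerated(n):
    """Byzantine-fault model (PBFT-style protocols): needs n >= 3f + 1."""
    return (n - 1) // 3


for n in (3, 4, 5, 7):
    print(n, "servers:",
          crash_faults_tolerated(n), "crash faults or",
          byzantine_faults_tolerated(n), "Byzantine faults")
```

So a 3-node cluster that survives one crashed server cannot mask even a single Byzantine one; that takes at least 4 nodes and a different protocol.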
dang over 1 year ago
I took a crack at finding the interesting past related threads. Any others?

(I've left out posts about particular implementations, extensions, and so on; there are too many. The intention is threads about the algorithm itself.)

Raft Is So Fetch: The Raft Consensus Algorithm Explained Through Mean Girls - https://news.ycombinator.com/item?id=33071069 - Oct 2022 (53 comments)
Raft Consensus Animated (2014) - https://news.ycombinator.com/item?id=32484584 - Aug 2022 (67 comments)
Why use Paxos instead of Raft? - https://news.ycombinator.com/item?id=32467962 - Aug 2022 (45 comments)
In Search of an Understandable Consensus Algorithm (2014) [pdf] - https://news.ycombinator.com/item?id=29837995 - Jan 2022 (12 comments)
Raft Consensus Protocol - https://news.ycombinator.com/item?id=29079079 - Nov 2021 (51 comments)
Paxos vs. Raft: Have we reached consensus on distributed consensus? - https://news.ycombinator.com/item?id=27831576 - July 2021 (48 comments)
Raft Visualization - https://news.ycombinator.com/item?id=25326645 - Dec 2020 (35 comments)
Raft: A Fantastical and Absurd Exploration - https://news.ycombinator.com/item?id=23129707 - May 2020 (1 comment)
Understanding Raft Consensus - https://news.ycombinator.com/item?id=23128787 - May 2020 (3 comments)
In Search of an Understandable Consensus Algorithm (2014) [pdf] - https://news.ycombinator.com/item?id=23113419 - May 2020 (26 comments)
Paxos vs. Raft: Have we reached consensus on distributed consensus? - https://news.ycombinator.com/item?id=22994420 - April 2020 (65 comments)
Raft Is So Fetch: The Raft Consensus Algorithm Explained Through Mean Girls - https://news.ycombinator.com/item?id=22520040 - March 2020 (4 comments)
Implementing Raft: Part 2: Commands and Log Replication - https://news.ycombinator.com/item?id=22451959 - Feb 2020 (16 comments)
Building a Large-Scale Distributed Storage System Based on Raft - https://news.ycombinator.com/item?id=21447528 - Nov 2019 (5 comments)
In Search of an Understandable Consensus Algorithm [pdf] - https://news.ycombinator.com/item?id=14724883 - July 2017 (14 comments)
Instructors' Guide to Raft - https://news.ycombinator.com/item?id=11300428 - March 2016 (3 comments)
Fuzzing Raft for Fun and Publication - https://news.ycombinator.com/item?id=10432062 - Oct 2015 (10 comments)
Prove Raft Correct - https://news.ycombinator.com/item?id=10017549 - Aug 2015 (27 comments)
Scaling Raft - https://news.ycombinator.com/item?id=9725094 - June 2015 (12 comments)
Raft Consensus Algorithm - https://news.ycombinator.com/item?id=9613493 - May 2015 (24 comments)
Creator of Raft is speaking at our meetup. What questions do you want answered? - https://news.ycombinator.com/item?id=9351794 - April 2015 (6 comments)
Replicating SQLite using Raft Consensus - https://news.ycombinator.com/item?id=9092110 - Feb 2015 (21 comments)
Raft Refloated: Do We Have Consensus? [pdf] - https://news.ycombinator.com/item?id=9015085 - Feb 2015 (4 comments)
Analysis of Raft Consensus [pdf] - https://news.ycombinator.com/item?id=8736868 - Dec 2014 (3 comments)
The Raft Consensus Algorithm - https://news.ycombinator.com/item?id=8527440 - Oct 2014 (27 comments)
Raft: Understandable Distributed Consensus - https://news.ycombinator.com/item?id=8271957 - Sept 2014 (79 comments)
Raft - The Understandable Distributed Protocol - https://news.ycombinator.com/item?id=6859101 - Dec 2013 (10 comments)
Raft, a scrutable successor to Paxos - https://news.ycombinator.com/item?id=5624627 - April 2013 (2 comments)
candiddevmike over 1 year ago
Anyone using object storage like S3 for cluster coordination/election instead?
valzam over 1 year ago
Obligatory plug: http://nil.csail.mit.edu/6.824/2022/schedule.html

MIT Distributed Systems course where Raft is implemented as a class assignment (assignment 2). The test suite around the assignment is incredibly valuable and !will! find bugs in your Raft implementation. The assignment is broken up into distinct steps to help people not get stuck when doing everything at once. It is still very challenging to implement everything, especially to get the performance tests to pass.
badcarbine over 1 year ago
ELI5