TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Gryadka is not Paxos, so it's probably wrong

132 pointsby arjunnarayanabout 8 years ago

13 comments

neuronsguyabout 8 years ago
I&#x27;m including the Gryadka author&#x27;s rebuttal here, for completeness:<p>Thank you for the analysis of my post but it seems that you didn&#x27;t get it correctly. Even to read a value you have to execute the full cycle (prepare, accept) of consensus (in the case of the stable leader we can skip prepare), so when you read the nil value the state of the system will be:<p>A: (value=foo ballot=1 promised=2) B: (value=nil ballot=2 promised=2) C: (value=nil ballot=2 promised=2)<p>Not the one you mentioned in the post:<p>A: (value=foo ballot=1 promised=2) B: (value=nil ballot=0 promised=2) C: (value=nil ballot=0 promised=2)<p>So the counter example is incorrect.<p>I proved the algorithm mathematically by hand and used very aggressive property based testing with fault injections so I&#x27;m pretty confident in its correctness.
评论 #13976207 未加载
mjbabout 8 years ago
This is a very good analysis.<p>I think it makes a slightly stronger argument about MultiPaxos than it needs to. There are other correct ways to use single-decree Paxos, as long as you recognize it&#x27;s limitations. The true limitation is &quot;use every register once&quot;, and a log is the gold standard way to do that. Another correct pattern could be a set-only K&#x2F;V store, or a store for monotonic finite state machine states (at the cost of O(N) rounds for every contender). One real-world example is 2PC coordination, where the state machine is very simple, and can be modeled as an ordered pair of three registers.
评论 #13951115 未加载
评论 #13954154 未加载
RcouF1uZ4gsCabout 8 years ago
<a href="http:&#x2F;&#x2F;www.allreadable.com&#x2F;5b354QWp" rel="nofollow">http:&#x2F;&#x2F;www.allreadable.com&#x2F;5b354QWp</a><p>&quot;every consensus protocol out there or every fully distributed consensus protocol is either Paxos or Paxos with cruft or broken&quot; - Mike Burrows
评论 #13951975 未加载
grogersabout 8 years ago
I haven&#x27;t done any in depth analysis of gryadka but I think the premise of the argument here may be wrong. Even if a particular value isn&#x27;t accepted by a majority of nodes, that doesn&#x27;t necessarily make it a dirty read. As long as anyone participating in the algorithm sees the history as if it was committed right before the cas result is committed, it could still be linearizable. I would need to create a formal model to be sure whether it is correct or not, but don&#x27;t just assume it&#x27;s wrong because it isn&#x27;t paxos (which it isn&#x27;t and it shouldn&#x27;t advertise like it is).
评论 #13954431 未加载
评论 #13976231 未加载
sargunabout 8 years ago
I think the most interesting thing the article states is the following question (and proposed answer):<p>Is it possible to get compare-and-swap without the log?<p>TL;DR: I don’t think so.<p>Riak Ensemble is such a system that utilizes single-decree Paxos in order to try to achieve CAS: <a href="https:&#x2F;&#x2F;github.com&#x2F;basho&#x2F;riak_ensemble" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;basho&#x2F;riak_ensemble</a><p>If the only the distinguished leader is allowed to publish a proposal for value change, then this allows you to preserve the CAS invariant. Unfortunately, this comes at the sacrifice of liveness. If you then have a second Paxos group to elect the leader, then this removes the liveness issue. On view change, you have to contact a quorum of nodes, and re-propose a new epoch in order to avoid issues.<p>I&#x27;m curious if someone can poke a hole in riak_ensemble&#x27;s algorithm.
norswapabout 8 years ago
It would have been nicer to contact the authors of the algorithm before making these (apparently incorrect) claims via blog post. It&#x27;s just sensationalistic.
gtrubetskoyabout 8 years ago
For those wondering, &quot;gryadka&quot; is &quot;bed&quot; as in vegetable&#x2F;garden&#x2F;flower bed in Russian.
评论 #13951868 未加载
elvinyungabout 8 years ago
I know that consensus is probably the cleanest way to guarantee very strict correctness, but I&#x27;m kind of not convinced that synchronous replication is the ideal solution for HA. I have yet to see clear evidence that teams at less-than-Google scale aren&#x27;t mostly getting by with semisynchronous replication, or even just asynchronous replication.<p>Always running an ensemble of three or five nodes for each shard seems to be pretty overkill (and expensive, both in terms of money and latency), especially if it&#x27;s mostly just a way to do automated failovers. I sometimes wonder if there&#x27;s a good-enough cheaper semisync alternative.
评论 #13953592 未加载
评论 #13953556 未加载
评论 #13953593 未加载
nsxwolfabout 8 years ago
That article was like stepping foot on an alien world. I have never even heard of a single one of those things.
评论 #13952055 未加载
评论 #13951878 未加载
joluxabout 8 years ago
What about Raft?
vittoreabout 8 years ago
Can we have TLA+ for Gryadka done by some?
评论 #13954177 未加载
whatnotestsabout 8 years ago
Everybody upvoting this without any discussion.<p>I suppose &quot;It&#x27;s not Paxos, so it&#x27;s probably wrong&quot; is beyond discussion (not that I disagree, of course).
评论 #13952756 未加载
jancsikaabout 8 years ago
&quot;If it’s not Paxos, it’s probably wrong.&quot;<p>Is Bitcoin Paxos?
评论 #13951452 未加载