Don't Settle for Eventual Consistency (2014)

91 points by alrex021 almost 8 years ago

3 comments

taeric almost 8 years ago
Alternatively, don't require strong consistency everywhere. Instead, make sure to have it in the places where it makes sense, and always reason about it.

I view this as a generalization of optimistic locking in source control. I don't envy my past self for having to check out code with a lock just to signal that I was going to change it.
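[Editor's note: a minimal sketch of the optimistic-locking pattern the comment alludes to, as opposed to checkout-with-lock. The Record class and version counter are illustrative assumptions, not anything from the article: a write succeeds only if the version it read is still current, so conflicts are detected at write time instead of being prevented by an up-front lock.]

# Illustrative sketch of optimistic locking: instead of taking a lock before
# editing, a writer remembers the version it read, and the store rejects the
# write if someone else committed a newer version in the meantime.

class Record:
    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        # Return the value together with the version it was read at.
        return self.value, self.version

    def write(self, new_value, read_version):
        # Succeed only if nothing changed since the caller's read.
        if read_version != self.version:
            return False  # conflict: the caller must re-read and retry
        self.value = new_value
        self.version += 1
        return True


doc = Record("v1 of the file")
value, ver = doc.read()
doc.write("concurrent edit", ver)    # another writer commits first
print(doc.write("my edit", ver))     # False: conflict detected, retry needed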
pdexter almost 8 years ago
See previous comments here: https://news.ycombinator.com/item?id=7632303
ianamartin almost 8 years ago
I would love to see a Jepsen evaluation of this system. I am not an expert in this area, but it still strikes me as eventually consistent. Discussions around data stores need to be bounded by what the data store can guarantee, not by what optimistic conditions allow for. The paper is pretty light on details about how the dependency graph works. I think that in theory there is an interesting idea here that basically packages a dependency graph along with every message. But consider this failure case:

t0: Alice: "I lost my wedding ring."
t1: Alice: "Nevermind, I found it."

Because of the peculiar circumstances around traffic at this moment, the nodes closest to Alice don't get that second message. Instead, the nodes closest to Bob receive it, and Bob sees the message before Alice does. Bob replies, and because everything is okay with the nodes close to him, the message gets passed back to Alice. What happens? The node Alice is reading from is having issues, so things are re-routed to Bob's cluster. Bob's message has a timestamp that legitimately puts it in between the two messages from Alice, because the server clocks are off a little.

Now, whatever was going wrong with the cluster close to Alice has been repaired, and it receives Bob's message with a node in the graph that Alice's cluster is completely unaware of, because it hasn't caught up.

Bob's message is telling the cluster, "You can't display this message (c) unless you have already acknowledged prior messages (a and b)." Alice's cluster says, "I have message a, but b doesn't exist."

According to the paper, Bob's message will never be delivered to Alice until the cluster closest to Alice syncs up with some other remote cluster and her second message is restored. Which, of course, we know can fail to happen with Cassandra, and the projects mentioned are forks of Cassandra.

I can't prove it formally, but this feels like the problem you run into with messaging queues: you can guarantee at-least-once delivery or at-most-once delivery, but not both at the same time. Here it's a different set of tradeoffs: under this theoretical model, you can deliver in order or not at all.

Obviously, this is a use case Facebook would care about. But I don't know that it's generally applicable to very many use cases.

Unless I'm missing something important, transmitting the graph doesn't really do anything to change the fact that this is eventual consistency, and it brings all of the problems that come with that. I guess the difference between plain eventual consistency and this kind of causal consistency is that one provides out-of-date/out-of-order data in an error condition, and the other presumably provides nothing but an error. YMMV.

But to a larger point, calling a combined system of client libraries with a forked version of Cassandra some other-category-of-consistency data store seems to me to be disingenuous. You're no longer talking about a data store. You're talking about an ecosystem of libraries, developer practices, and a data store that work together to provide some workarounds for the challenge that the CAP theorem presents. This is, unless I'm badly misreading the article, granting a lot of leeway.

I also find it unfortunate that the optimized version of this only compared throughput, without discussing failure cases at all. The CAP theorem is all and only about what can be guaranteed in failure cases. We need to agree from now on that if we're going to talk about the CAP theorem, we are talking about guarantees in failure modes, not optimistic modes. Otherwise, there is no value in invoking the theorem.

And this strikes me as a particularly bold assumption given almost anyone's real-world experience: "It is partition-tolerant because all reads are in the local data center (partitions are assumed to occur only in the wide area, not in the local data center)."

No, kind sir. It is not partition-tolerant if that is your assumption. It has already failed the P part of CAP if that's an assumption you are relying on.

I'm not trying to shoot the article or the authors down here. Given that the CAP theorem has been proved to be a conceptual "pick two" situation, of course we have to do everything we can to make up for that and do our best, and this seems like a great effort at that.

But I think that the authors fundamentally misunderstand the point of writing a paper about a data store and its relationship to the CAP theorem. It's not about optimistic cases, software development patterns, or libraries that sit on top of the data store.

It's about guarantees of what happens in failure modes. From reading this paper, all I can tell with certainty is that it's not consistent, and it's not partition-tolerant. It's highly available. To me that seems like a step backwards from platforms that are, or at least aim to be, CA or AP.
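[Editor's note: the dependency check described in the comment above can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation; the message ids, the Replica class, and the in-memory buffering are all assumptions made for the example. A replica holds back any message whose declared dependencies it has not yet applied, which is exactly why Bob's reply (c) stalls while Alice's cluster is missing message b.]

# Sketch of dependency-gated delivery: a replica buffers an incoming message
# until every message it declares a dependency on has been applied locally.

class Replica:
    def __init__(self):
        self.applied = set()   # ids of messages already visible locally
        self.pending = []      # (msg_id, deps, payload) still waiting on deps

    def receive(self, msg_id, deps, payload):
        # Accept a message plus the ids of the messages it causally depends on.
        self.pending.append((msg_id, set(deps), payload))
        self._drain()

    def _drain(self):
        # Keep delivering any pending message whose dependencies are satisfied.
        progress = True
        while progress:
            progress = False
            for entry in list(self.pending):
                msg_id, deps, payload = entry
                if deps <= self.applied:
                    print(f"deliver {msg_id}: {payload}")
                    self.applied.add(msg_id)
                    self.pending.remove(entry)
                    progress = True


# The scenario from the comment: Alice's replica never saw message "b",
# so Bob's reply "c" (which depends on a and b) stays buffered indefinitely.
alice = Replica()
alice.receive("a", [], "Alice: I lost my wedding ring.")
alice.receive("c", ["a", "b"], "Bob: Sorry to hear that!")  # never delivered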