Stonebraker: Clarifications on the CAP Theorem and Data-Related Errors

25 points, by ora600, over 14 years ago

5 comments

dasht, over 14 years ago
A quick summary follows the quick editorial, and after that a quick new thought.

Editorial: Stonebraker is, imo, and as usual, Right Again. His biggest problem is that he's boring that way. He doesn't open his mouth in contexts like this but to be Right.

Summary: People say "NoSQL is right because of the CAP theorem." The CAP "theorem" says of DBs: Consistency, high Availability, or Partition-tolerance ... pick any two. Quite true! So one of the pro-NoSQL arguments is that high availability and partition tolerance are often the priorities, so toss out consistency! SQL assumes consistency. Thus you need NoSQL. Stonebraker correctly points out that, hey, you know what? Partitions are pretty rare, and tossing out consistency really didn't increase your average availability by much ... so you tossed out consistency for no reason whatsoever. If you think the CAP "theorem" justifies NoSQL, you're just wrong.

Stonebraker's rant is nearly boring because it makes such an obvious point.

New Thought: I don't think NoSQL is popular because of the CAP theorem. I think it is popular because it is easier to get started with (even if that means using it poorly) than SQL. SQL is a little hard to learn. It's a little bit awkward to use from some "scripting" language or other HLL. NoSQL may be bad engineering in many of its uses ... but it's easier. A lot easier. And people don't ask much about engineering quality until sites start failing often. Which a heck of a lot of them do, but by then the NoSQL architects have collected their money and are out of town, or else are still around but able to point fingers of blame away from the abandonment of ACID.

An ACID DB that gave a more simple-minded logical model than SQL ... including, sure, relaxing ACID constraints where that was really desirable ... could go a long way toward fixing the confusion around NoSQL.

p.s.: given a typical distributed NoSQL DB, one thing you could do is regard that as the "physical model", implement proper transactions, and build a library that gave you ACID properties. Build up a high-level way of using that library so that you have a logical model of the data that is independent of exactly how it is laid out in the underlying thing ... and you've got a Codd-style DB. Great thing to do. SQL 2.0
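
The layering described in the p.s. can be illustrated with a minimal sketch, assuming an in-memory, single-process stand-in for the key-value store. `VersionedStore`, `Transaction`, and `transfer` are hypothetical names invented for this example, and the technique shown is plain optimistic concurrency control (validate read versions at commit time), which is one possible approach and not something the comment prescribes. On a real distributed store the `commit` step would itself need an atomic commit protocol, which this sketch simply assumes.

```python
class VersionedStore:
    """Hypothetical stand-in for the 'physical' key-value store."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        # Missing keys are reported as version 0 with no value.
        return self._data.get(key, (0, None))

    def commit(self, read_versions, writes):
        # Assumption: this method runs atomically (true here, single process).
        # Validation: every key the transaction read must be unchanged.
        for key, version in read_versions.items():
            if self.get(key)[0] != version:
                return False
        # Apply all writes together, bumping each key's version.
        for key, value in writes.items():
            self._data[key] = (self.get(key)[0] + 1, value)
        return True


class Transaction:
    """Buffers reads and writes, then validates and applies them at commit."""

    def __init__(self, store):
        self._store = store
        self._reads = {}   # key -> (version, value) as first observed
        self._writes = {}  # key -> pending value

    def read(self, key):
        if key in self._writes:
            return self._writes[key]
        if key not in self._reads:
            self._reads[key] = self._store.get(key)
        return self._reads[key][1]

    def write(self, key, value):
        self._writes[key] = value

    def commit(self):
        read_versions = {k: ver for k, (ver, _) in self._reads.items()}
        return self._store.commit(read_versions, self._writes)


def transfer(store, src, dst, amount):
    """Move `amount` between two keys, retrying on conflict."""
    while True:
        tx = Transaction(store)
        src_balance = tx.read(src) or 0
        dst_balance = tx.read(dst) or 0
        if src_balance < amount:
            return False
        tx.write(src, src_balance - amount)
        tx.write(dst, dst_balance + amount)
        if tx.commit():
            return True


store = VersionedStore()
setup = Transaction(store)
setup.write("alice", 100)
setup.write("bob", 0)
setup.commit()
transfer(store, "alice", "bob", 30)
```

Application code only ever touches `Transaction`, so the logical model (accounts and transfers) stays independent of how the store lays out its keys, which is the separation the p.s. is after.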
jchrisa, over 14 years ago
This is the comment I left on the post (still in moderation):

There is an extreme case of partition tolerance that must be considered: disconnected operation.

For users at the edge of the network, latency can be the biggest performance killer. If it takes 1 second or more for each user action to be reflected in application state due to round-trip time (mobile web), those seconds add up and users can be frustrated.

However, if you move the database and web application to the mobile device itself, users no longer see network latency as part of the user experience critical path. Latency has been shown to correlate directly with revenue, because users engage much more readily with snappy interfaces.

Once data is being operated on by the user on the local device, the key becomes synchronization. Asynchronous multi-master replication demands a different approach to consistency than the traditional model, which assumes the database is being run by a central service.

The MVCC document model is designed for synchronization. It's a different set of constraints than the relational model, but since it's such a highly constrained problem space it also admits of general solutions and protocols.

It's my belief that the MVCC document model is closer to the 80% solution for a large class of applications. Storing strongly typed and normalized representations of data is an artifact of our historically constrained computing resources, so it will always be a good way to optimize certain problems.

But for many human-scale data needs, schemaless documents are a very good fit. They optimize for the user, not the computer.
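
As a rough illustration of why revisioned documents lend themselves to synchronization, here is a minimal sketch, assuming each replica keeps a document's revision history as an ordered list of revision IDs. The function name `compare_histories`, the ID format, and the four outcome labels are all hypothetical and do not reflect the API of any particular database. The idea is only that two replicas can decide what to do by comparing histories, without a central service.

```python
def compare_histories(local, remote):
    """Classify how two replicas' revision histories relate.

    Each history is a list of revision IDs, oldest first.
    Returns one of: "in_sync", "pull", "push", "conflict".
    """
    if local == remote:
        return "in_sync"
    # Remote extends local: the local replica is behind and can fast-forward.
    if remote[:len(local)] == local:
        return "pull"
    # Local extends remote: the remote replica is behind.
    if local[:len(remote)] == remote:
        return "push"
    # Both sides edited independently after a common ancestor.
    return "conflict"


# A phone edited the document offline while the server also changed it.
phone = ["1-a", "2-b", "3-phone"]
server = ["1-a", "2-b", "3-server"]
status = compare_histories(phone, server)
```

In the "conflict" case a real system still has to pick a winner or merge the two versions; this sketch only detects that the histories diverged.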
logicalstack, over 14 years ago
He seems to assume a lot in this post. For instance, his 200 vs 4 node comparison assumes that you have 200 nodes because the poor performance of your DBMS requires that many nodes. If that's the case, great, use VoltDB. If not, it's perfectly reasonable to think that 200 nodes would have more network partitions than 4 nodes, which is why one would pick an AP system in the first place.
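
The intuition that more nodes means more partition exposure can be put in numbers with a back-of-the-envelope sketch. It assumes, purely for illustration, that each node is independently cut off with the same probability `p` during some time window; both the independence assumption and the value `p = 0.001` are made up for this example and come from neither the comment nor Stonebraker's post. Under that assumption, the chance that at least one of `n` nodes is affected is `1 - (1 - p)**n`.

```python
def p_any_affected(n, p):
    """Probability that at least one of n nodes is cut off, assuming each
    node is affected independently with probability p."""
    return 1 - (1 - p) ** n


p = 0.001  # hypothetical per-node probability for one time window
small_cluster = p_any_affected(4, p)    # roughly 0.4%
large_cluster = p_any_affected(200, p)  # roughly 18%
```

Real failures are correlated (a switch failure takes out a whole rack), so these figures only show the direction of the effect, not its actual size.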
jnewland, over 14 years ago
tl;dr version: http://files.jnewland.com/stonebraker-20101021-200946.jpg
moonpolysoft, over 14 years ago
Wherein Stonebraker misunderstands distributed systems engineering completely.