Call Me Maybe: MongoDB Stale Reads

605 points by llambda about 10 years ago

24 comments

Maro about 10 years ago
From 2009 to 2012 I had a distributed database startup that competed with MongoDB. We used Paxos for replication and built the database with on-disk consistency guarantees --- like the ones this article looks for and rightly obsesses over --- in mind.

https://github.com/scalien/scaliendb

Outcome: you've never heard of ScalienDB; MongoDB brilliantly won by winning the hearts and minds of hackers and coders who don't care about such issues, but were able to get started quickly with Mongo (and got cool free cups at meetups). It turns out that's most engineers out there, definitely the initial critical mass to target for a database startup like Mongo.

Btw. the story behind Oracle is similar: early versions were basically write-only; read Ellison's book 'Softwar'. Of course there are other ways to get started: for example DBs coming out of academic research like Vertica seem to avoid this problem; in that case initial funding is basically provided by the gov't and when they create the company to commercialize they're already shooting for Enterprise contracts, skipping the opensource/community building phase of Mongo.
bkeroack about 10 years ago
If you are a database author and you get a bug report from Kyle, spend a *long* time thinking about it before closing the issue as invalid.
jxf about 10 years ago
The most interesting lessons from the Jepsen series:

* You should never trust, and always verify, the claims made by database manufacturers.
* Especially when those claims relate to data integrity.
* Super-especially when every safety level provided by the manufacturer that includes the word "SAFE" is actually unsafe.
addisonj about 10 years ago
Mongo absolutely nailed creating a database that is easy to get started with and that even makes traditionally 'hard' things such as replication easy. It is still super attractive for me to pick it up for small projects, even after dealing with its (many) pain points both in development and operational settings.

Given this, it is so tragic to see how dismissive they have been with regard to the consistency issues that have plagued the db since the early days. Whether it was the stupidity of bad defaults in drivers to not confirm writes, or easily corruptible data in the 1.6 days, or now with not seriously looking at the results of Jepsen, the MongoDB organization has never taken the issues head on. It would be so refreshing to see more transparency and admitting to the faults rather than wiggling around them until eventually pushing a fix buried in patch notes.

I often feel like a MongoDB apologist when I admit that I don't mind using Mongo for small (and not important) projects, and while the MongoDB hate can be a bit extreme at times, the company's treatment of these sorts of issues may justify some of it.
dantiberian about 10 years ago
There's a lot going on here, but the summary is: "What Mongo actually does is allow stale reads: it is possible to execute a WriteConcern=MAJORITY write of a new value, wait for it to return successfully, perform a read with ReadPreference=PRIMARY, and not see the value you just wrote."

https://jira.mongodb.org/browse/SERVER-17975?focusedCommentId=892980&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-892980
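One way a stale read like the one quoted above can arise is when a partitioned node still believes it is primary and keeps answering reads while the majority side has already elected a new primary and accepted a write. The sketch below is a toy model in plain Python, not MongoDB's replication protocol or driver API; the names `Node`, `majority_write`, `read_from_primary`, and `demo_stale_read` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One replica-set member in a toy model (not MongoDB's real protocol)."""
    name: str
    believes_primary: bool = False
    data: dict = field(default_factory=dict)


def majority_write(primary, reachable, cluster_size, key, value):
    """Apply a write on `primary` and on every node it can reach.

    Returns True only if a strict majority of the cluster holds the write.
    """
    acked = [primary] + [n for n in reachable if n is not primary]
    if len(acked) * 2 <= cluster_size:
        return False
    for node in acked:
        node.data[key] = value
    return True


def read_from_primary(node, key):
    """Serve a read locally as long as the node *believes* it is primary."""
    if not node.believes_primary:
        raise RuntimeError(f"{node.name} is not primary")
    return node.data.get(key)


def demo_stale_read():
    n1, n2, n3 = Node("n1", believes_primary=True), Node("n2"), Node("n3")
    cluster = [n1, n2, n3]
    assert majority_write(n1, [n2, n3], len(cluster), "x", 0)

    # Partition: n1 is cut off but has not stepped down yet.
    # The majority side {n2, n3} elects n2 (assumed here, not modelled).
    n2.believes_primary = True

    ok = majority_write(n2, [n3], len(cluster), "x", 1)  # majority-acknowledged
    stale = read_from_primary(n1, "x")  # old primary still answers with x = 0
    fresh = read_from_primary(n2, "x")  # new primary returns x = 1
    return ok, stale, fresh


if __name__ == "__main__":
    print(demo_stale_read())  # (True, 0, 1)
```

In this model the write is acknowledged by a majority and the read is served by a node that considers itself primary, yet the read misses the write, because nothing forces the old primary to confirm its role with a majority before answering.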
jamescostian about 10 years ago
I'm so glad to see the Jepsen series re-instated. Thank you so much Stripe
jxf about 10 years ago
Question: How do I actually run Kyle's tests to see this for myself? (Not that I don't believe him, I just want to play around a bit.)

When I run `lein install` and then `lein test`, I get:

    ╰─▶ ψ lein test
    Exception in thread "main" java.io.FileNotFoundException: Could not locate jepsen/db__init.class or jepsen/db.clj on classpath: , compiling:(mongodb/core.clj:1:1)
        at clojure.lang.Compiler.load(Compiler.java:7142)
        at clojure.lang.RT.loadResourceScript(RT.java:370)
        at clojure.lang.RT.loadResourceScript(RT.java:361)
cpks about 10 years ago
People really underestimate the value of Occasional Consistency. Occasionally Consistent databases, like MongoDB, are great for approximation algorithms, sublinear time algorithms, and similar applications.
geowa4 about 10 years ago
Since Postgres added a JSON type and Docker made running it simple in development, I haven't had a need for anything else. Call me old school, but I prefer starting with a relational database and changing when it's no longer appropriate.
ivanb about 10 years ago
So what should users of MongoDB do? I'm asking because it is the main database used in Meteor and I'm very interested in Meteor.

Should the general advice just be "store in MongoDB everything that doesn't require consistency and use Postgresql for everything else"?
rdtsc about 10 years ago
I still don't get it. MongoDB can't possibly call itself a database. I can understand MongoScratchStorage, MongoProbabilisticDataEngine but not MangoDB.
sylvinus about 10 years ago
Another instance of Kyle's amazing research! You may want to catch him on stage with other great minds at dotScale on June 8: http://dotscale.io
Kiro about 10 years ago
This article is too technically advanced for me. As a casual MongoDB user, how do these problems affect me?
bsaul about 10 years ago
I seem to remember from a FoundationDB talk that they first spent two years building a simulation environment to control everything from network to persistence for testing scenarios.

Does anyone know of any open-source project that would aim at doing the same, so that future NoSQL DBs can finally be built on strong foundations?
narrator about 10 years ago
I knew something was funny with Mongo when all the api calls defaulted to writes not being guaranteed to sync to disk. Maybe for a use case like aggregate statistics gathering it would be ok to risk missing a few updates in a crash for the sake of speed, but to make that the default??
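The trade-off described in the comment above can be illustrated with a toy model in plain Python. This is a sketch under stated assumptions, not MongoDB's storage engine or any driver's API: `ToyStore`, its `wait_for_sync` flag, and `demo_lost_write` are hypothetical names, and "disk" is just a second dictionary that survives a simulated crash.

```python
class ToyStore:
    """In-memory write buffer plus a 'disk' dict; a crash drops the buffer."""

    def __init__(self):
        self.buffer = {}  # writes accepted but not yet durable
        self.disk = {}    # writes that survive a crash

    def write(self, key, value, wait_for_sync=False):
        """Accept a write; flush to 'disk' first only if the caller asks."""
        self.buffer[key] = value
        if wait_for_sync:
            self.sync()
        # Returning without a sync gives the caller no durability promise.

    def sync(self):
        """Move everything buffered so far onto 'disk'."""
        self.disk.update(self.buffer)
        self.buffer.clear()

    def crash_and_restart(self):
        """Simulate a crash: anything still in the buffer is lost."""
        self.buffer.clear()

    def read(self, key):
        return self.buffer.get(key, self.disk.get(key))


def demo_lost_write():
    store = ToyStore()
    store.write("b", 2, wait_for_sync=True)  # caller waits for durability
    store.write("a", 1)                      # fire-and-forget
    store.crash_and_restart()
    return store.read("a"), store.read("b")


if __name__ == "__main__":
    print(demo_lost_write())  # (None, 2)
```

The fire-and-forget write returns faster because it skips the flush, which is the speed-for-safety trade the comment questions as a default.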
lobo_tuerto about 10 years ago
I think it would be great to see one of these done for RethinkDB :)
bakhy about 10 years ago
I must admit, I always feel like I am missing something in these discussions. Like I didn't get some memo... I just don't expect a DB like MongoDB to guarantee consistency. The whole story around NoSQL and the likes was to enable crazy horizontal scaling needed for the web. Phrases like "eventual consistency" flew around. It seems so logical - you lose consistency, gain scalability.

But somehow, people simply started using them everywhere? Assuming that these DBs are just like any other? And now, we're all bashing on MongoDB because it is - not consistent? What happened here? :)

NB that I do not wish to attack the OP - if MongoDB now claims to be consistent in any way, that deserves scrutiny. And these analyses are always a really interesting read. But the general tone in the developer community about MongoDB seems a bit irrational.
agopaul about 10 years ago
So, now I'm wondering: why is Stripe using Mongo at all? Maybe they are planning to migrate to another DBMS?
ccleve about 10 years ago
Does anyone have any references on how you *could* write a distributed database that met all ACID properties? Surely there's an academic paper that says that if you do A then B then C, you are guaranteed a certain level of consistency.

We've developed a type of distributed database at my company, and I think it's pretty solid, but I need a broader familiarity with the available theory.
posnet about 10 years ago
Would the use of WiredTiger as a storage engine affect these results?
chatman about 10 years ago
Apache Solr has done very well at Jepsen tests.
chucksmart about 10 years ago
Maybe we should listen to Larry Ellison when he says "gimme my money!"
pje about 10 years ago
upvoted for the Look Around You link alone.
threeseed about 10 years ago
Shame this wasn't done with the latest version, 3.0. Although given that improvements are scheduled for 3.1, I would imagine it might still be an issue.

Nice writeup either way though. Would like to see a similar article for Couch* and MySQL.