MongoDB queries don’t always return all matching documents

444 points by dan_ahmadi almost 9 years ago

44 comments

Said it before, will say it again... "MongoDB is the core piece of architectural rot in every single teetering and broken data platform I've worked with." The fundamental problem is that MongoDB provides almost no stable semantics to build something deterministic and reliable on top of it. That said, it is really, really easy to use.

lossolo almost 9 years ago

I've just migrated one project from Mongo to PostgreSQL and I advise you to do the same. It was my mistake to use Mongo: on the first day I used the db I found a memory leak in cursors, which I reported and they fixed. That was in 2015. If you have a lot of relations in your data, don't use Mongo; it's just hype. You will end up with collections without relations and then do joins in your code instead of having the db do it for you.

hardwaresofton almost 9 years ago

If you're currently using MongoDB in your stack and are finding yourselves outgrowing it or worried that an issue like this might pop up, you owe it to yourself to check out RethinkDB: https://rethinkdb.com/ It's quite possibly the best document store out right now. Many others in this thread have said good things about it, but give it a try and you'll see. Here's a technical comparison of RethinkDB and Mongo: https://rethinkdb.com/docs/comparison-tables/ Here's the aphyr review of RethinkDB (based on 2.2.3): https://aphyr.com/posts/330-jepsen-rethinkdb-2-2-3-reconfiguration

lath almost 9 years ago

A lot of MongoDB bashing on HA. We use it and I love it. Of course we have a dataset suited perfectly for Mongo: large documents with little relational data. We paid $0 and quickly and easily configured a 3 node HA cluster that is easy to maintain and performs great. Remember, not all software needs to scale to millions of users, so something affordable and easy to install, use, and maintain makes a lot of sense. Long story short, use the best tool for the job.

danbmil99 almost 9 years ago

Oh, the FUD of it. The behavior is well documented here https://jira.mongodb.org/browse/SERVER-14766 and in the linked issues. Seasoned users of MongoDB know to structure their queries to avoid depending on a cursor if the collection may be concurrently updated by another process. The usual pattern is to re-query the db in cases where your cursor may have gone stale. This tends to be habit due to the 10-minute cursor timeout default. MongoDB may not be perfect, but like any tool, if you know its limitations it can be extremely useful, and it certainly is way more approachable for programmers who do not have the luxury of learning all the voodoo and lore that surrounds SQL-based relational DBs. Look for some rational discussion at the bottom of this Mongo hatefest!

ahachete almost 9 years ago

Strongly biased comment here, but hope it's useful. Have you tried ToroDB (https://github.com/torodb/torodb)? It still has a lot of room for improvement, but it basically gives you what MongoDB does (even the same API at the wire level) while transforming data into a relational form. Completely automatically, no need to design the schema. It uses Postgres, but it is far better than JSONB alone, as it maps data to relational tables and offers a MongoDB-compatible API. Needless to say, queries and cursors run under REPEATABLE READ isolation mode, which means that the problem stated by OP will never happen here. Problem solved. Please give it a try and contribute to its development, even just by providing feedback. P.S. ToroDB developer here :)

cachemiss almost 9 years ago

My general feeling is that MongoDB was designed by people who hadn't designed a database before, and marketed to people who didn't know how to use one. Its marketing was pretty silly about all the various things it would do, when it didn't even have a reliable storage engine. Its defaults at launch would consider a write stored when it was buffered for send on the client, which is nuts. There are lots of ways to solve the problems that people use MongoDB for, without all of the issues it brings.

vegabook almost 9 years ago

I have moved from Mongo to Cassandra in a financial time series context, and it's what I should have done straight from the get-go. I don't see Cassandra as that much more difficult to set up than Mongo, certainly no harder than Postgres IMHO, even in a cluster, and what you get leaves everything else in the dust if you can wrap your mind around its key-key-value store engine. It brings enormous benefits to a huge class of queries that are common in time series, logs, chats etc., and with it, no-single-point-of-failure robustness and real-deal scalability. I literally saw a 20x performance improvement on range queries. Cannot recommend it more (and no, I have no affiliation with Datastax).

jsemrau almost 9 years ago

Weird to see that Mongo is still around. We started to use it on a project ~4 years ago. Easy install, but that's where the problems started. Overall terrible experience. Low performance, syntax a mess, unreadable documentation. They seem to still have this outstanding marketing team.

paradox95 almost 9 years ago

Should an infrastructure company be advertising the fact that it didn't research the technology it chose to use to build its own infrastructure? All these people saying Mongo is garbage are likely neckbeard sysadmins. Unless you're hiring database admins and sysadmins, Postgres (unless managed, in which case you have a different set of scaling problems) or any other traditional SQL store is not a viable alternative. This author uses Bigtable as a point of comparison. Stay tuned for his next blog post comparing IIS to Cloudflare. Almost every blog post titled "why we're moving from Mongo to X" or "Top 10 reasons to avoid Mongo" could have been prevented with a little bit of research. People have spent their entire lives working in the SQL world, so throw something new at them and they reject it like the plague. Postgres is only good now because they had to add some of the features in order to compete with Mongo. Postgres has been around since 1996 and you're only now using it? Tell me more about how awesome it is.

ruw1090 almost 9 years ago

While I love to hate on MongoDB as much as the next guy, this behavior is consistent with read-committed isolation. You'd have to be using Serializable isolation in an RDBMS to avoid this anomaly.

twunde almost 9 years ago

The real problem with Mongo is that it's so enjoyable to start a project with that it's easy to look for ways to continue using it even when Mongo's problems start surfacing. I'll never forget how many problems my team ended up facing with Mongo. Missing inserts, slow queries with only a few hundred records, document size limits. All while Mongo was paraded as web scale in talks.

wzy almost 9 years ago

Does Meteor support a proper database system yet, à la MySQL or Postgres?

aavotins almost 9 years ago

MongoDB reminds me of an old saying that if you have a problem and you use a regex to solve it, you end up with two problems. I have personally used MongoDB in production two times for fairly busy and loaded projects, and both times I ended up being the person who encouraged migrating away from MongoDB to a SQL-based storage solution. Even at my current job there's still evidence that MongoDB was used for our product, but it eventually got migrated to PostgreSQL. Most of the time I've thought that I chose the wrong tool for the right job, which may be true, but that still leaves a lot to think about regarding the correct application. Right now I have a MongoDB anxiety: as soon as I start thinking about maybe using it (with an emphasis on maybe), I remember all the troubles I went through and just forget it. It is certainly not a bad product, but it's a niche product in my opinion. Maybe I just haven't found the niche.

jtchang almost 9 years ago

This single issue would make me not want to use MongoDB. I'm sure there are design considerations around it, but I'd rather use something that has sane semantics around these edge cases.

Animats almost 9 years ago

Not when they're changing rapidly, anyway. Well, that's relaxed consistency for you. Does this guy have so many containers running that the status info can't be kept in RAM? I have a status table in MySQL that's kept by the MEMORY engine; it's thus in RAM. It doesn't have to survive reboots.

fiatjaf almost 9 years ago

CouchDB is simple and reliable. You can understand it from day one. I can't imagine why it isn't being used.

avital almost 9 years ago

I believe this is solved by Mongo's "snapshot" method on cursors: https://docs.mongodb.com/v3.0/faq/developers/#faq-developers-isolate-cursors

rjurney almost 9 years ago

Mongo is hilarious. Ease of use is so important, we just don't much give a shit that it has all these gaping holes and flaws in it.

shruubi almost 9 years ago

Seriously, who looks at MongoDB and thinks "this is a sane way of doing things"? To be fair, I've never been much of a fan of the whole NoSQL solution, so I may be biased, but what real benefits do you gain from using NoSQL over anything else?

d3ckard almost 9 years ago

I worked with MongoDB quite a lot in the context of Rails applications. While it has performance issues and can generally become a pain because of its lack of relational features, it also allows for really fast prototyping (and I believe that Mongoid is much nicer to work with than Active Record). When you're developing MVPs and working with ever-changing designs and features, the ability to cut out the whole migration part comes in really handy. I would, however, recommend that anybody keep a migration plan ready for the moment the product stabilizes. If you don't, you end up in a world of pain.

hendzen almost 9 years ago

Actually, if this lack of index update isolation is correct, you can get the matching document zero, one or multiple times!

doubleorseven almost 9 years ago

Mongo, in one word: sucks. Couchbase, does not.

spullara almost 9 years ago

It literally returns wrong answers for queries. I can't believe anyone in this thread is defending it.

jitix almost 9 years ago

What storage engine are you using? I wonder if the same issue occurs with the WiredTiger MVCC engine.

alkonaut almost 9 years ago

So it's a bit weak in the design department, offers a bit less rigid semantics than one might hope, and from the start it's a technology that was almost a reaction to the rigid and enterprise-y of old. Mongo reminds me a wee bit of JS...

xchaotic almost 9 years ago

Unless you want to code every RDBMS and enterprise feature in the application layer, don't use Mongo; use Postgres or use MarkLogic. It is 'NoSQL', but it is ACID compliant and uses MVCC, so what the queries return is predictable.

Osiris almost 9 years ago

I hear a lot about MongoDB's reliability issues. How do CouchDB or other document store databases compare in terms of reliability and consistency?

clentaminator almost 9 years ago

An interesting read into the development of a project that started using MongoDB and switched to PostgreSQL after eight months in production: http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/

partycoder almost 9 years ago

This use case is not something that you would use MongoDB for. Try Zookeeper. This being said, I would feel embarrassed to post this on behalf of the engineering department of a company. This post is just a very illustrated way of saying "we have no idea about what we are doing and our services are completely unreliable". This is so bad that it is more of an HR problem than an engineering problem.

tinix almost 9 years ago

Y'all know other storage engines exist, right? I searched the comments for "percona" and found nothing... Figures. Meanwhile, https://github.com/percona/percona-server-mongodb/pull/17

bbcbasic almost 9 years ago

Ahhh, the Trough of Disillusionment! [1] https://setandbma.wordpress.com/2012/05/28/technology-adoption-shift/

xenadu02 almost 9 years ago

Use of MongoDB at PlanGrid is probably the single worst technical decision the company ever made. We've migrated our largest collections to Postgres tables and our happiness with that decision increases by the day.

vs2370 almost 9 years ago

I am pretty excited about CockroachDB. It's still in beta, so not suggested for production use yet, but it's being designed pretty carefully and by a great team. Check them out: cockroachlabs.com

mouzogu almost 9 years ago

Is MongoDB really that bad? I am someone just getting into Meteor JS and it seems like moving away from MongoDB would make Meteor trickier to learn. Is it difficult to switch to an alternative? Thanks

wvenable almost 9 years ago

I wonder how much data they are storing and in what pattern that they actually need a NoSQL database. I'm curious why someone would make that choice.

acarrera almost 9 years ago

If you were inserting changes in the status, you'd have much better data and never run into such issues.
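One way to read the suggestion above is as an append-only log: each status change becomes a new record, and the current status is derived from the most recent record per item. The sketch below is a toy in-memory model in plain Python, not MongoDB code; `StatusLog`, the container ids and the status strings are all hypothetical names chosen for illustration.

```python
from itertools import count


class StatusLog:
    """Append-only log of status changes; nothing is ever updated in place."""

    def __init__(self):
        self._events = []      # (sequence, container_id, status), in insert order
        self._seq = count(1)   # monotonically increasing sequence numbers

    def record(self, container_id, status):
        """Append a new event instead of overwriting the previous status."""
        self._events.append((next(self._seq), container_id, status))

    def current(self):
        """Fold the log into {container_id: latest status}."""
        latest = {}
        for _, container_id, status in self._events:
            latest[container_id] = status  # later events overwrite earlier ones
        return latest

    def with_status(self, status):
        """Container ids whose latest recorded status equals `status`."""
        return sorted(cid for cid, s in self.current().items() if s == status)


log = StatusLog()
log.record("c1", "starting")
log.record("c2", "starting")
log.record("c1", "running")

print(log.with_status("starting"))
print(log.current())
```

Because existing records are never modified, nothing has to change position while a reader is walking the data, and the full history is kept as a side effect. Whether this maps cleanly onto a given MongoDB schema and its indexes is an assumption that would need checking.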

geoPointInSpace almost 9 years ago

I'm prototyping in Meteor using MongoDB and Compute Engine. I have two VM instances in Google Cloud Platform. One is a web app and the other is a MongoDB instance. They are in the same network. The connection I use is their internal IP. Can other people eavesdrop on traffic between my two instances?

apeace almost 9 years ago

TL;DR During updates, Mongo moves a record from one position in the index to another position. It does this in-place without acquiring a lock. Thus during a read query, the index scan can miss the record being updated, even if the record matched the query before the update began.
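The mechanism described in this TL;DR can be reproduced with a toy model: a sorted list stands in for the index, a reader walks it in key order, and a writer relocates one entry partway through the scan. This is a plain-Python sketch under those assumptions, not MongoDB's actual storage engine; `move`, `scan` and `fresh_index` are hypothetical helpers.

```python
import bisect


def move(index, doc_id, new_key):
    """Writer: relocate doc_id's entry to new_key, keeping the index sorted."""
    index[:] = [entry for entry in index if entry[1] != doc_id]
    bisect.insort(index, (new_key, doc_id))


def scan(index, writer=None, pause_after=2):
    """Reader: walk the index in key order, resuming after the last entry seen.

    `writer` (if given) runs once, after `pause_after` entries have been read,
    to stand in for a concurrent update landing mid-scan.
    """
    seen, last = [], None
    while True:
        # Resume just past the last entry returned, as a key-ordered cursor would.
        pos = 0 if last is None else bisect.bisect_right(index, last)
        if pos >= len(index):
            return seen
        last = index[pos]
        seen.append(last[1])
        if writer is not None and len(seen) == pause_after:
            writer(index)


def fresh_index():
    """Four documents, each with one (key, doc_id) index entry."""
    return [(1, "a"), (2, "b"), (3, "c"), (4, "d")]


# 'd' moves from ahead of the cursor to behind it: the scan never returns it.
missed = scan(fresh_index(), writer=lambda idx: move(idx, "d", 0))

# 'a' moves from behind the cursor to ahead of it: the scan returns it twice.
doubled = scan(fresh_index(), writer=lambda idx: move(idx, "a", 5))

print(missed)
print(doubled)
```

In both runs every document exists for the whole scan, yet the result has either a missing or a duplicated entry, which is the "zero, one or multiple times" outcome mentioned elsewhere in the thread.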

opless almost 9 years ago

But it's web scale! </sarcasm>

wizardhat almost 9 years ago

TLDR: He was reading the database while another process was writing to it. Why all the Mongo hate? I'm sure this would happen with other databases.

throoooowaway almost 9 years ago

But is your database webscale? MongoDB is a web scale database.

rgo almost 9 years ago

Every time I hear arguments for going back to relational databases, I remember all the scalability problems I lived through for 15 years in relational hell before switching to Mongo. The thing about relational databases is that they do everything for you. You just lay the schema out (with ancient E-R tools maybe), load your relational data, write the queries and indexes, and that's it. The problem was scalability, or any tough performance situation really. That's when you realized RDBMSs were huge lock-ins, in the sense that they would require an enormous amount of time to figure out how to optimize queries and db parameters so that they could do that magic outer join for you. I remember queries that would take 10x more time to finish just by changing the order of tables in a FROM. I recall spending days trying different Oracle hints just to see if that would make any difference. And the SQL way, with PK constraints and things like triggers, just made matters worse by claiming the database was actually responsible for maintaining data consistency. SQL, with its naturalish language syntax, was designed so that businessmen could query the database directly about their business, but somehow that became a programming interface, and finally things like ORMs were invented that actually translated code into English so that a query compiler could translate that back into code. Insane! Mongo, like most NoSQL, forces you to denormalize and do data consistency in your code, moving data logic into solid models that are tested and versioned from day one. That's the way it's supposed to be done; it sorta screams "take control over your data, goddammit". So, yes, there's a long way to go with Mongo or any general-purpose NoSQL database really, but RDBMS seems a step back even if your data is purely relational.

TimPrice almost 9 years ago

The article is interesting, but the title is FUD. Besides, all this is not unexpected:
> How does MongoDB ensure consistency?
> Applications can optionally read from secondary replicas, where data is eventually consistent by default. Reads from secondaries can be useful in scenarios where it is acceptable for data to be slightly out of date, such as some reporting applications.
https://www.mongodb.com/faq
