The "no migrations" claim is a lie. Migrations still exist, they just happen at the time of reading:<p><pre><code> def parseMongoRow(jsonBlob):
x = None
if jsonBlob['date'] < 2012/12/02:
x = jsonBlob['foo']['bar']
else if jsonBlob['date'] < 2013/06/01:
x = mergeFields(jsonBlob['bar'], jsonBlob['baz'])
else:
x = jsonBlob['x']
...
</code></pre>
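The relational way is to pay that cost once, when the schema changes, rather than on every read. A rough sketch of the same cleanup done up front (table and column names are hypothetical, since we never see their actual schema):<p><pre><code> -- one-time backfill at migration time
 ALTER TABLE records ADD COLUMN x TEXT;
 UPDATE records SET x = foo_bar
   WHERE created < '2012-12-02';
 UPDATE records SET x = bar || baz   -- stand-in for whatever mergeFields() does
   WHERE created >= '2012-12-02' AND created < '2013-06-01';
 -- rows written on or after 2013-06-01 already store x directly
 </code></pre>
After that, the date checks disappear from the read path entirely.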
For expiring fields in a Redis database, that kind of read-time patching isn't a big deal: your code only stinks during the window [time of migration, time of migration + TTL]. For a permanent datastore, ouch.<p>I also do not understand why they are using MongoDB for this. They describe a schema involving shipments, shippoints, and orders, none of which exists without the others (an attempt at justifying the document-based data model?). I.e., something like this:<p><pre><code> CREATE TABLE orders (...);
 CREATE TABLE shippoints (...);
 CREATE TABLE shipments (
     id BIGSERIAL PRIMARY KEY,
     order_id BIGINT REFERENCES orders(id) NOT NULL,
     ship_from_id BIGINT REFERENCES shippoints(id) NOT NULL,
     ship_to_id BIGINT REFERENCES shippoints(id) NOT NULL,
     ...  -- plus date, carrier, zone, price, and so on
 );
</code></pre>
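With those constraints, a row that points at a nonexistent shippoint is simply rejected at insert time (a sketch, assuming PostgreSQL, and that order 1 exists while shippoint 999 does not):<p><pre><code> INSERT INTO shipments (order_id, ship_from_id, ship_to_id)
 VALUES (1, 999, 999);
 -- ERROR:  insert or update on table "shipments" violates foreign key constraint
 </code></pre>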
You can't have a shipment without a shippoint. Black magic!<p>They have several hundred shipments/day, and let's be generous and assume there are 100 updates/notes per shipment. We are talking maybe 50,000 inserts/day (omfg big data, invest in a 1TB hard disk!), and it sounds like data that's considerably more important than ad impressions or pageviews. (Well actually <i>it fits in RAM</i>, so maybe the 1TB hard disk on a dedicated server is overkill.)<p>Also, consider the daily aggregate generator in 3 lines of SQL rather than 236 lines of JavaScript:<p><pre><code> SELECT date, carrier, zone, COUNT(id), SUM(price)
 FROM shipments
 GROUP BY date, carrier, zone;
</code></pre>
I don't get it. How is MongoDB even remotely the right tool for this job?