How tiring. We can litter the internet with posts like this, but it would be a <i>lot</i> more useful to post reasoned, factual and detailed posts when discussing the merits or pitfalls of a given technology.<p>I've launched very high traffic websites using MongoDB, where it was the least of my worries. I've also launched very high traffic websites using MySQL, where it was my primary source of pain.<p>Shall I now run around screaming loudly about how badly MySQL sucks? On the surface you might assume so, but with reasonable effort to consider all the facts, it becomes clear that there's more to this than just "this database is better than that one because I hate it."<p>This post is more of an angry payback rant from someone who got banned from IRC for being abusive to newcomers and greenies. On top of that, it is factually incorrect in some spots, and misleading in others. Not impressed.<p>That said, NoSQL databases are evolving extremely quickly, and have reached a point of maturity that they can excel in the right scenario. Just like relational databases, they can provide big benefits, given that the user takes the time to learn how to use them, as well as making sure they have a use case that merits the strengths of their chosen tech.<p>I really enjoy working with MongoDB and Redis, and have been a community cheerleader of sorts for PostgreSQL for years. It <i>is</i> possible to have multiple tools in your toolbox, and use them when appropriate. No tool is perfect, and no tool is perfect for every job.<p>Complaining bitterly that your shiny new piano makes a lousy boat, on the other hand, simply adds to the noise.
It seems like ranters against MongoDB often don't really understand how it works and thus what it is good for.<p>My simple mental model for MongoDB is: indices (should!) fit in memory and documents are stored contiguously on disk. That is it in a nutshell. A query involves in-memory lookups and maybe only one disk seek. Writes in place are usually possible.<p>I have been happiest with MongoDB in two different scenarios:<p>The first is in developing small web applications where there is no scaling issue, and the fact is that MongoDB is so easy to develop against and simply provides a great developer experience. As needed, I use a cron job to do a mongodump a few times a day; or, if I really need high availability (which, frankly, often I don't: if a system is unavailable one or twice a year for an hour it is no big deal) then replica sets are OK.<p>The second scenario where I have really liked using MongoDB was doing analytics on a modestly large stream of social media data. A single Mongo master on a large EC2 instance was adequate to handle writes and slaves on other large EC2 instances each fed a different analytics application. This setup of apps reading from a slave on the same server worked really well for me. This was a low hassle experience.<p>I do have one customer with really large MongoDB setups on multiple data centers, and I am working around right now on some hassles, but we haven't found anything else as cost effective for the customer's applications.<p>All that said, when I can use it, just using a single (no horizontal scaling) PostgreSQL server is for me the most hassle free developer experience, but I have always used PostgreSQL for small or medium sized applications - nothing that needed to scale.
I feel like 1 - 2 years ago I was reading a slew of blog posts with the title "Why We Chose MongoDB." Now it seems like all of the blog posts are some sort of "We Just Finished Migrating off of MongoDB, Here's Why."<p>I know nothing about MongoDB and have never tried it. But the message seems pretty clear.
While some of the author's criticisms are valid, some of them are completely wrong:<p>> Having no option to perform an operation comparable to UPDATE table SET foo=bar WHERE....<p>What? db.collection.update does exactly this. See: <a href="http://www.mongodb.org/display/DOCS/Updating#Updating-update%28%29" rel="nofollow">http://www.mongodb.org/display/DOCS/Updating#Updating-update...</a><p>MongoDB fit a nice niche for a read heavy mid-scalability db solution. Every DB has it's niche. Trying to use it outside of what it's good for is going to get you burned. If people just did their research before blindly committing to a platform, we'd see a lot less posts like this.
> Leaving memory management to the operating is nice idea - in reality it does not scale and does not play very well.<p>This is why I think that Linus's tirade against O_DIRECT is misguided: <a href="https://lkml.org/lkml/2007/1/10/233" rel="nofollow">https://lkml.org/lkml/2007/1/10/233</a><p>Here's the thing: the kernel is a library. It took me a long time to fully understand this deep idea. The kernel is just a library that has a different and more expensive calling convention (syscalls) and runs at a higher privilege level.<p>It's also much less flexible than user-space libraries. Its interface is an unholy mix of syscalls, ioctl(), /proc, vdso, etc. There is a high bar to adding new interfaces. Removing or changing existing interfaces is basically not allowed.<p>The resources that the kernel uses are much harder to account for or predict. How can you ensure that a process always gets at least X MB of page cache, and that some enormous "cp" that some sysadmin is running won't evict all your MongoDB pages that are caching your database? Sure you could mlock() your pages, but now you're basically side-stepping all of this smart kernel cache management that was supposed to be helping you so much in the first place.<p>User-space management of buffers and caches is more flexible, easier to account to its owner, and more predictable. It can't handle page faults with Linux's current interfaces, but the L4 guys have figured out how to let pagers run in user-space and handle page faults. I hope that someday this work becomes mainstream. Our 20-year-old OS design is showing its age.
<i>There is no single way to control the memory usage using system tools except maintaining mongod instances on dedicated virtual machines without running further services. There are numerous complaints from people about this stupid architectural decision from various side and 10gen is doing nothing to change this brain-dead memory model.</i><p>Can someone explain to me why this is actually a big issue? Except for really tiny apps, I imagine that having dedicated VMs for your MongoDB actually would be perfectly fine? Probably even preferred?
Hell yeah.<p>From what I've seen of MongoDB I'm not impressed at all. In some carefully controlled cases, performance would be acceptable, but change anything at all (even the order that data is inserted) and it just sucks.<p>For one particular application, the performance difference between MySQL and Mongo was like the difference between the Space Shuttle and a Chevy Sonic.
"My essage to companies building applications on top of MongoDB: assigned smart people to MongoDB and don't leave the database work to people that can hardly spell their name or that can just count to three. Yes, this paragraph is harsh and does not comply with diversity but it is true and reality. The number of people that should not do any database related work, people without reasonable background, people lacking basic skills in understanding databases is extraordinary high."<p>Isn't that true for <i>any</i> database? What point are you trying to make? That a large MySQL deployment can be flawlessly be maintained by people that can "hardly spell their name"?
This is kind of a strange list of complaints.<p>MongoDB memory management is a legitimate concern... but not because it's hard to control memory usage of a single mongod.<p>"More granular locking" is a temporary, non-scalable solution?<p>I've run out of energy, actually, but really?
This reads more like a rant then an actual discussion of problems the company was having with MongoDB. There is a place for valid criticism, but this is the polar opposite. I am actually more interested in the fact that this made it to the front page so fast tehn the actual content of the article. Are there so many people upset with Mongo that even something as poorly done as this rant can get publicity ?
But.. but.. MongoDB is web-scale: <a href="http://www.youtube.com/watch?v=b2F-DItXtZs" rel="nofollow">http://www.youtube.com/watch?v=b2F-DItXtZs</a><p>(Note how the video raises some of the same concerns as the blog post)
I'm amazed at all these (excuse me) idiotic articles. People/projects have different requirements, so there are many databases around(relational and nosql and key/value). Just because your needs do not match MongoDB's (or MySQL's or ...), does not mean the technology is useless.
The biggest problem with mongoDB IMO is that BSON dictionaries are ordered. Let that sink in for a sec: the hash data structure must be ordered....
The solution most drivers run with is to just alphabetically order each dictionary.... a ineffcientcy I'm not really happy with.
This is completely unrelated to the subject, and I have not used mongodb, have no idea if it's good or bad, and I don't even know how to spell it, but isn't it ironic that this post shows up on the day when the top HN post's title is "Please learn to write"?<p>As I said, no idea how good or bad mongo is, but I'm guessing, if you are as sloppy in your code as you are in your English, I'll be happy to give mongo the benefit of the doubt...
Its unfortunate that right now none of the 3 major document stores seem to be doing all that well or are easy to use straight out of the box. I use and like mongodb but only for prototyping. I havent decided what to go with longer term if my projects have a need. Couchdb is interesting but seems to be going through some serious growing pains right now with the couchbase product being very confusing to figure out and use. Riak is also interesting but it seems more specialty then a general purpose tool.<p>Kind of a bummer.
I keep reading about how mongo's use of memory mapped files is real bad. Isn't that the same technique used by varnish cache and that's what makes it awesome? Can someone explain please?
Blame yourself before blaming MongoDB. If you've been around the software industry you should always be mindful of fads and vaporwares. When you make a decision to use MongoDB you better have done your homework first or do some testing yourself. Given the low cost of renting bunch of EC2 machines for a few hours, it's idiotic to build a business around MongoDB or any other system that has not been fully proven without doing bunch of stress testing yourself. Yes, and don't trust software vendors, get independent advice or test it yourself.
A good read. I'm working on a relatively large project now, written in Node (whee), and I was considering going full-koolaid with Mongo. I think I may stick to MySQL.
<i>using JSON as a query language was a bad decision. The current JSON query language works for standard queries but the functionality of the operators is limited.</i><p>These two things don't go hand in hand. JSON <i>could</i> be used to elegantly represent complex queries. A problem with the query system isn't necessarily a problem with JSON.
We started out with just MySQL. Then added MongoDB + replicasets. Then added Cassandra. And now we just finished adding Elastic Search. All of this for the same Web Application. Use the right tool for the job. The pattern i've noticed is that indeed we started migrating DATA out of MongoDB, mostly to Cassandra.
tl;dr We LOVED MongoDB (<a href="http://www.zopyx.de/blog/plone-using-highcharts-and-jqgrid" rel="nofollow">http://www.zopyx.de/blog/plone-using-highcharts-and-jqgrid</a>) but we got burnt so it's USELESS and BRAINDEAD!
mmap files and sharding...<p>It seems like the problem is that you're not using MongoDB in a sharded setup to begin with. For good or bad, MongoDB targets the scale where you need sharded and replicated setups. In other words, a large enough operation to require multiple servers for data storage. If you need the opposite of that, which is multitenancy, MongoDB is not going to be a good fit.<p>On the other hand, MongoDB has always been sold as a rapid prototyping and easy to iterate datastore, which is attractive for people working on small projects. Then they have an "oh shit" moment when they run into operational issues.