At some point I'll write up my notes about how I've been using MongoDB. I've basically given up on SQL databases. Every architecture, at every scale from small startup to Enterprise, is better handled by MongoDB, sometimes in conjunction with Kafka (any sufficiently large operation ends up heterogeneous and polyglot, with several database technologies in play).<p>When you're a small startup just getting off the ground, you can create a single MongoDB instance (ignore everything you've heard about Web Scale) and stuff data into it as needed, without thinking much about the structure. You can add contracts to your database functions, tightening the specification as you learn more about what your project is really about. To get a sense of that style of development, see what I wrote in "How ignorant am I, and how do I formally specify that in my code?"<p><a href="http://www.smashcompany.com/technology/how-ignorant-am-i-and-how-do-i-specify-that" rel="nofollow">http://www.smashcompany.com/technology/how-ignorant-am-i-and...</a><p>MongoDB is also great for ETL. You can pull JSON from third-party APIs, store it in its original form, and later transform it into whatever shapes you need.<p>In a large Enterprise, you will inevitably be trying to get multiple services and databases to work together. The old style for dealing with this was the ESB (Enterprise Service Bus) or SOA (Service Oriented Architecture), but in recent years most of the big companies I've worked with have moved toward something like a unified log, as Jay Kreps described in "The Log: What every software engineer should know about real-time data's unifying abstraction".
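To make the ETL idea concrete, here is a minimal sketch in Python. It uses a plain dict in place of a real MongoDB database so it runs anywhere; the collection names and field names (`raw_api_events`, `email`, and so on) are invented for illustration. With a real deployment you would swap the dict operations for pymongo collection calls.

```python
import json
from datetime import datetime, timezone

# Stand-in for a MongoDB database: collection name -> list of documents.
# In production this would be a pymongo Database object.
db = {"raw_api_events": [], "users_by_email": []}

def ingest(raw_json: str) -> None:
    """Store the third-party payload exactly as received (the 'extract' step)."""
    doc = json.loads(raw_json)
    doc["_ingested_at"] = datetime.now(timezone.utc).isoformat()
    db["raw_api_events"].append(doc)

def transform_to_user_view() -> None:
    """Later, reshape the raw documents into the form one service needs."""
    db["users_by_email"] = [
        {"email": d["email"].lower(), "name": d.get("name", "")}
        for d in db["raw_api_events"]
        if "email" in d
    ]

# Keep the original form first; derive other shapes afterwards.
ingest('{"email": "Ada@Example.com", "name": "Ada"}')
transform_to_user_view()
```

The point of keeping the raw documents untouched is that when a new consumer shows up needing a different shape, you write another transform over the same raw collection rather than re-fetching from the third-party API.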
If you haven't read Kreps' essay yet, go read it now:<p><a href="https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying" rel="nofollow">https://engineering.linkedin.com/distributed-systems/log-wha...</a><p>In that setup, MongoDB offers a flexible cache for the most recent snapshot your service has built from what it read off Kafka.<p>Some people sabotage themselves with MongoDB by treating canonical data as if it were a cache. That leads to disaster, and I believe it is what happened to Sarah Mei; her experience led her to write "Why You Should Never Use MongoDB":<p><a href="http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/" rel="nofollow">http://www.sarahmei.com/blog/2013/11/11/why-you-should-never...</a><p>The one rule I would suggest: always be clear, in your own head, about which collections are canonical and which are caches. When I talk to teams who are new to this, I tell them to adopt a naming convention, such as prefixing every canonical collection with "c_"; every other collection is assumed to be a cache. The great thing is that caches are very cheap to create. You can keep 20 caches of the same data in slightly different formats: one where the JSON is optimized for what the Web front-end needs, another where it is optimized for the mobile app, another optimized for an API for external partners. Just don't fall into the trap Sarah Mei describes, where everything is treated as a cache; you need to be clear about which data is canonical. If you are using Kafka the way Jay Kreps describes, then the data in Kafka is canonical and everything in MongoDB is a cache. At smaller operations, I've used MongoDB to hold both the canonical data and the caches, in different collections.
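The canonical-versus-cache convention can be sketched like this. Again, plain dicts stand in for MongoDB collections, and the document shapes are hypothetical; only the "c_" prefix rule comes from the advice above.

```python
# Stand-ins for MongoDB collections; in production these would be
# db["c_users"], db["users_web"], etc. accessed through pymongo.
collections = {
    "c_users": [  # canonical: the source of truth
        {"id": 1, "name": "Ada Lovelace", "email": "ada@example.com",
         "roles": ["admin"]},
    ],
}

def is_canonical(name: str) -> bool:
    """Naming convention: collections prefixed 'c_' hold canonical data;
    everything else is a rebuildable cache."""
    return name.startswith("c_")

def rebuild_caches() -> None:
    """Caches are cheap: derive several shapes of the same canonical data,
    each optimized for a different consumer."""
    users = collections["c_users"]
    # Web front-end wants display-ready fields.
    collections["users_web"] = [
        {"name": u["name"], "email": u["email"]} for u in users
    ]
    # Mobile app wants a compact payload.
    collections["users_mobile"] = [{"n": u["name"]} for u in users]
    # External partner API must not see internal fields like roles.
    collections["users_partner_api"] = [{"email": u["email"]} for u in users]

rebuild_caches()
```

If a cache is wrong or stale, you drop it and call `rebuild_caches()` again; only the "c_" collections ever need to be backed up or treated with care.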