When I read posts like this all but confirming that MongoDB isn't the great product 10Gen make it out to be, I wonder how the heck MongoDB is still even relevant, and then I remind myself that 10Gen have one of the best marketing and sales teams in the game at the moment.

While MongoDB has improved greatly over previous versions, I can't help but feel that if 10Gen put as much effort into improving their product as they do into selling it, Mongo would be a force to be reckoned with!

MongoDB is good at some things, but I think most people who try it and fail fall into one of two camps: either 10Gen sold them on it, or they bought into the hype without assessing their project requirements and making sure MongoDB was a sensible choice.
Viber, one of the largest over-the-top messaging apps, recently shared the story of their migration from Mongo to Couchbase. They ended up needing less than half of the original servers, with better performance.

If you want to see a video of their engineer telling the story, it's available here: http://www.couchbase.com/presentations/couchbase-tlv-2014-couchbase-at-viber
This article caught my interest, as I've been reading up on Cassandra. But some previous research had me thinking that Cassandra works best with under a TB per node. Is SQL still better when you have really large nodes (16-32TB) and only really want to scale out for more storage?

I'm currently humming along happily with Postgres, but some of Cassandra's distributed features and availability guarantees look really nice.
Sounds like the intended use case for ElasticSearch.

"Given some input piece of data {email, phone, Twitter, Facebook}, find other data related to that query and produce a merged document of that data"
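For what it's worth, here's a rough sketch of that lookup-and-merge using the elasticsearch Python client (8.x-style keyword arguments). The index name, field names, and the naive "first hit wins" merge strategy are all my own assumptions for illustration, not anything from the article:

    from elasticsearch import Elasticsearch

    # Connect to a local cluster (URL is an assumption for illustration).
    es = Elasticsearch("http://localhost:9200")

    def find_related(value: str) -> dict:
        """Search the identity fields for `value` and merge the matching docs."""
        resp = es.search(
            index="contacts",
            query={
                "multi_match": {
                    "query": value,
                    "fields": ["email", "phone", "twitter", "facebook"],
                }
            },
        )
        merged: dict = {}
        for hit in resp["hits"]["hits"]:
            # Earlier (higher-scoring) hits win; later hits only fill in gaps.
            for field, val in hit["_source"].items():
                merged.setdefault(field, val)
        return merged

    print(find_related("alice@example.com"))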
"To buy us time, we ‘sharded’ our MongoDB cluster. At the application layer. We had two MongoDB clusters of hi1.4xlarges, sent all new writes to the new cluster, and read from both..."<p>I'm curious about this. Why were you doing the sharding manually in your application layer? Picking a MongoDB shard key - something like the id of the user record - would produce some fairly consistent write-load distribution across clusters. Regardless - it seems like write-load was a problem for you, yet you sent all the write load to the new cluster - why not split it?
Yet another shining example of throwing money and time away to work within AWS constraints, when bare metal and OpenStack (1) would have solved it more cheaply (2) and arguably faster.

1 (if you insist on cloud-provisioning instances, even though it makes little sense when the resources are as strictly dedicated as they are in this case)

2 (VASTLY, over time -- these guys are pissing money away at AWS and I hope their investors know it)