One thing that could likely get you fired rather quickly is running analytics on your live transactional system. Yes, your business needs to make decisions based on data; this is not terribly new. But to think that you only have one data store is a bit short-sighted.<p>Many businesses (including startups) have moved to using document stores for high-read environments and scraping nightly drops to their backend analytics systems. This is smart - you don't want to run summing/aggregation on a live transactional system for (hopefully) obvious reasons.<p>EDIT: it's also worth noting that map/reduce is typically much more powerful when aggregating large datasets. When trying to run analytics on top of a transactional system, developers like Ray here would end up with multiple joins and groupings - all of which slow <i>everything</i> down. Map/reduce certainly isn't perfect, but the author dismisses it as difficult witchcraft when, in practice, parallel execution of MR queries can greatly decrease the resources and time it takes to get to information.<p>I sort of think we've moved beyond this discussion.
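For anyone who finds the map/reduce part abstract, here is a minimal single-process sketch of the shape of such an aggregation in plain Python. The `ORDERS` data, the function names, and the partition size are all made up for illustration; a real MR system would run the map step on many machines, which is where the speedup comes from.

```python
from collections import Counter
from functools import reduce

# Hypothetical records from a nightly drop: (customer_id, amount_in_cents).
ORDERS = [
    ("alice", 1200), ("bob", 450), ("alice", 300),
    ("carol", 999), ("bob", 50), ("alice", 100),
]


def chunks(records, size):
    """Split the records into fixed-size partitions."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_partition(partition):
    """Map step: total spend per customer within one partition."""
    totals = Counter()
    for customer, amount in partition:
        totals[customer] += amount
    return totals


def reduce_totals(left, right):
    """Reduce step: merge two partial results by adding per-key totals."""
    merged = Counter(left)
    merged.update(right)
    return merged


def total_spend(records, partition_size=2):
    """Aggregate spend per customer without any joins or GROUP BY."""
    # Each partition is independent, so this map could be farmed out to workers.
    partials = map(map_partition, chunks(records, partition_size))
    return dict(reduce(reduce_totals, partials, Counter()))


if __name__ == "__main__":
    print(total_spend(ORDERS))
```

The point is only that the map step never needs to see another partition's data, so it parallelizes trivially, while the reduce step is a cheap merge.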
The unhappy truth is that for many startups, relational integrity and transaction safety are simply not very valuable. Customers of an early-stage startup are by definition willing to take a risk on whatever they're getting from that startup. So simply <i>not thinking</i> about these problems - accepting that occasionally a partial write will happen, or two writes will collide, or a migration will not quite work correctly and your pages will crash until it's fixed - is a worthwhile sacrifice to increase development speed.
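To make concrete what is being given up, here is a small sketch using Python's built-in `sqlite3` module. The `accounts` table, the names, and the amounts are hypothetical; the only point is that the two updates either both land or neither does, which is the guarantee a store without transactions leaves to you.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )
    conn.commit()
    return conn


def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (remaining,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if remaining < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    db = make_db()
    print(transfer(db, "alice", "bob", 60), balances(db))
    print(transfer(db, "alice", "bob", 60), balances(db))
```

Without the rollback, the failed second transfer would leave both rows already modified, and it would be up to application code to notice and undo that.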
I'm totally with you. We've tried object databases, XML databases, and NoSQL databases. Now we understand that a relational database is simply the right way to go for web applications, because it handles structured data so well and makes querying, sorting, and filtering seamless and effortless. It's a must.<p>The same goes for choosing a Linux distribution and JDK mode. See the references here:<p><a href="http://bingobo.info/blog/table-of-contents.jsp" rel="nofollow">http://bingobo.info/blog/table-of-contents.jsp</a><p>BTW, your title should have "relational" instead of "relation".
Serious question: what are NoSQL databases really good for? I'm only really used to relational DBs, and I'm unclear about which problems a NoSQL database is useful for.
People too often forget about graph databases when talking about NoSQL solutions. Graph databases offer an interesting and elegant alternative to relational databases and I could definitely see a startup decide to use this kind of technology.<p>As far as I know, most graph databases support transactions and offer great scalability. Such databases are also schema-less and can be queried with Gremlin, a powerful graph traversal language (see www.tinkerpop.com).<p>With respect to scalability and transactions, Titan (<a href="http://thinkaurelius.com/" rel="nofollow">http://thinkaurelius.com/</a>) looks very promising: it supports various backends for storage (Cassandra, HBase, etc.) and indexing (currently Elasticsearch and Lucene). Graph analytics can be done via Faunus (<a href="http://thinkaurelius.github.io/faunus/" rel="nofollow">http://thinkaurelius.github.io/faunus/</a>), backed by Hadoop.<p>There are other vendors out there (Neo4j, OrientDB, etc.) which offer interesting solutions worth looking at - I'm just a bit less familiar with them.<p>The major downside I see with graph databases is that most of them are fairly recent and their ecosystem is tiny (though growing). Should a startup bet on such young technologies, or stick to mature and battle-tested solutions (i.e. relational databases)?<p>Could startups use this kind of graph "NoSQL" database? I don't see why not. If your startup is some kind of social network, graph databases are certainly an option worth considering. If I were to create a startup, I'd hardly use a document database like MongoDB, but I would seriously consider using a graph database. In the end, it's all about having the right tool in hand, and knowing how to assess what is "right" for you.
There can only be one DB, much like LOTR can only have one ring. Why? That's the only area where the linked article falls down. It's a pretty good article other than that.<p>So you properly normalized your entire system, from customer billing transaction records all the way up to article tags. Then article tags get too huge. So the next version looks at both the RDBMS and Redis, and the version after that only looks at Redis. Customer billing transactions remain on a "real" DB and the tag cloud lives on Redis. And the problem with that is... what exactly?<p>It's obsolete thinking: "I can't have two databases because we're a poor startup and the only databases that exist are DB2 and Oracle and everyone knows they're super expensive, so super expensive times two is unaffordable." Dude, it's almost 2014, not 1980. Postgres/MySQL/Redis, it's all free.
This comes up every now and then on HN. There are plenty of NoSQL horror stories.<p>Thing is, most SQL databases at scale are a bit of a horror too. Have you seen real-life production relational databases? Gawd. Hacks on hacks. Then you add another database. And another analytics database. And a bunch of point-to-point data feeds. Argh.<p>But hey. That's data.<p>If you think choosing SQL will solve your analytics woes down the line -- it's just not true. You're in for some pain no matter what you do.<p>... That's unless you get a porcelain schema the first time. Which, if you're in a startup, probably means you're working on the wrong problem.<p>That's not an argument <i>for</i> using NoSQL (I used MongoDB daily, but I've got plenty of love for PostgreSQL). It's a rebuttal of the idea that SQL magically solves a different problem.
Strongly disagree with the article, as simplification always looks shiny. Startups should sit back for a few hours or days and invest the work to answer some serious questions such as these: <a href="http://nosql-database.org/select-the-right-database.html" rel="nofollow">http://nosql-database.org/select-the-right-database.html</a> (there are other catalogs like this one).<p>Then you get a little closer to the truth.
I often read the argument that NoSQL databases are schemaless. Yes, the database is, but your data either has a schema or it doesn't. YOU must know your data.<p>"All the while moving work onto the developers to standardize how they handle different migration cases."<p>I know a startup is fast and bla bla...
BUT your team should know the tools that you are using...
For me, SQL DBs force me to add a new field and some kind of value, and I don't like being forced into a solution.<p>"In document stores, you have two choices: store related data as sub-documents, or store related data as separate documents with references. It is up to the developers to understand the trade-offs of both approaches. Selecting one over the other can lead to performance gains or issues, scalability issues and above all, make asking certain questions of the data a lot harder."<p>Again, know the tools you are using.
And for example MongoDB has good ORMs too.<p>"But that takes much more forethought and is dependent on a particular problem."<p>If your startup is doing something new and shiny, you don't have the knowledge and forethought, and you often don't know what particular problem will come at you.<p>Most of the points look like:
You learned SQL at university, so now you "know" it (but in real life you don't), and you use it because you know how to normalize a database. The same line of argument is often used to say why Java is so great or why JavaScript is bad.<p>I personally started with PHP, then moved to Rails, and now to Meteor (which uses MongoDB), and before Meteor we could never build a good prototype so fast, which is very important for a startup.<p>So yeah, if you're comfy with SQL, use it; if you're comfy with NoSQL, use it.
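On the sub-documents vs. references choice quoted earlier in this comment, a rough sketch of what the trade-off looks like, using plain Python dicts as stand-ins for documents. No real driver or MongoDB API is used here, and the posts, comments, and function names are invented for illustration.

```python
# Option 1: comments embedded as sub-documents of each post.
POSTS_EMBEDDED = [
    {"_id": 1, "title": "Why SQL", "comments": [
        {"user": "ann", "text": "Agreed"},
        {"user": "raj", "text": "Not sure"},
    ]},
    {"_id": 2, "title": "Why NoSQL", "comments": [
        {"user": "ann", "text": "Depends"},
    ]},
]

# Option 2: comments stored as separate documents that reference their post.
POSTS = [{"_id": 1, "title": "Why SQL"}, {"_id": 2, "title": "Why NoSQL"}]
COMMENTS = [
    {"_id": 10, "post_id": 1, "user": "ann", "text": "Agreed"},
    {"_id": 11, "post_id": 1, "user": "raj", "text": "Not sure"},
    {"_id": 12, "post_id": 2, "user": "ann", "text": "Depends"},
]


def comments_for_post_embedded(post_id):
    """One document read returns the post together with its comments."""
    post = next(p for p in POSTS_EMBEDDED if p["_id"] == post_id)
    return [c["text"] for c in post["comments"]]


def comments_for_post_referenced(post_id):
    """Needs a second lookup against the separate comments collection."""
    return [c["text"] for c in COMMENTS if c["post_id"] == post_id]


def comments_by_user_embedded(user):
    """Has to walk into every post to find one user's comments."""
    return [
        c["text"]
        for p in POSTS_EMBEDDED
        for c in p["comments"]
        if c["user"] == user
    ]


def comments_by_user_referenced(user):
    """A single filter over the comments collection."""
    return [c["text"] for c in COMMENTS if c["user"] == user]


if __name__ == "__main__":
    print(comments_for_post_embedded(1), comments_by_user_referenced("ann"))
```

Both layouts answer both questions, but each makes one of them cheap and the other awkward, which is the "know your data" part.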
Don't they? For some tasks relational databases are good, for some they are worse. Call me captain.<p>However, relational databases will have a hard time with big data, because your dataset is bigger than your database and you have no relational integrity.
Depends what your startup is doing. If you are only using your database to store some basic transactions, then a relational database is a very good fit. This is really the case for most startups tackling common problems. However, if your startup is tackling a problem with unique technical challenges, then you can't just ignore the issue. For example, a geo-location startup tracking the location in real time of users with a free app is simply not going to be able to use a relational database.
I don't know. It's a hard question. MongoDB is used at pretty much every hackathon simply because it's easy to set up, driver support is good, and it's schemaless. The last one is really why people use MongoDB over a SQL DBMS. For a startup, there might be a concern that schema migration is tough.<p>But one can argue that not being careful with schema design can break the API and make the codebase messy.<p>I guess I will stick with the hard work now... I guess not being careful with the schema will definitely bite me.
He forgot one of the very important reasons to use (some) NoSQL databases: high availability. Relational database systems are very poor at providing that. Most often the availability options are limited to resistance to node failures. RDBMSes have several SPOFs and must use failover, which is not dependable, is hard to test, and often needs manual intervention. Forget resistance to network partitions.