The recent launch of FathomDB is the latest of many stories on HN about databases, scalability, and the cloud.<p>What is the state of the art -- in terms of scalability, management, monitoring -- for people who need to keep their data in-house? Maybe X where X:FathomDB as Xobni:Gmail? Or are we stuck hiring a DBA?
There is no easy out. If you really need to scale, then regardless of infrastructure you are going to need someone who understands databases. If you're lucky you might be able to find someone who can do more than one job, e.g. DBA + sysadmin. Just make sure that they really do know both.
If it helps, I asked a more concrete version of this question, focusing on PostgreSQL, on Stack Overflow: <a href="http://stackoverflow.com/questions/556325/postgresql-management-and-monitoring" rel="nofollow">http://stackoverflow.com/questions/556325/postgresql-managem...</a>
The state of the art _is_ hiring a DBA. They get paid to be up on all the database monitoring, scaling, and management tools. This is why DB consultants charge $200+ an hour. It's pretty arcane knowledge, and for a business of any size, the database quickly becomes the lifeblood of the organization.<p>This is also the reason that databases like CouchDB, Amazon SimpleDB, or Google's App Engine datastore are so appealing. They all promise to reduce DB management headaches, at the cost of sacrificing features.
Try a commercial database, such as Microsoft SQL Server (FD: I work there as a developer). To bring up a recent example, it can build indexes without locking the whole table, so you wouldn't need to jump through hoops like the FriendFeed guys just did with MySQL. Unless your time is free, it will cost you less to buy SQL Server than to write your own "online index thingie" on top of MySQL.<p>And if your time <i>is</i> free, then you are probably a startup and you can get SQL Server and a bunch of other MS tools for free via <a href="http://www.microsoft.com/BizSpark/" rel="nofollow">http://www.microsoft.com/BizSpark/</a><p>MySpace runs on SQL Server: <a href="http://www.sqlservercentral.com/blogs/steve_jones/archive/2009/02/18/myspace.aspx" rel="nofollow">http://www.sqlservercentral.com/blogs/steve_jones/archive/20...</a> and if it's good enough for them it will likely be good enough for your site as well. I'm not sure if the scale-out story is any good (there are "partitioned views", which are basically "sharding", but I have never used those), but the scale-up story is as good as they come: it can efficiently utilize the 64 cores of an HP Superdome.<p>I don't know much about Oracle and IBM DB2, but I know that the latter has a free version without too many restrictions on it, so give them a spin as well. If ease of use is important to you, common wisdom has it that Microsoft's is the easiest to use among the top commercial vendors. NB: I did not verify the common wisdom, YMMV :-)
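For anyone wondering what the "partitioned views are basically sharding" remark amounts to in practice, here is a minimal sketch of range-based routing done in application code. This is not SQL Server's mechanism and not anyone's production setup: the shard names, the ID boundaries, and the `shard_for` helper are all made up for illustration, and a real deployment would also have to handle rebalancing and cross-shard queries.

```python
from bisect import bisect_right

# Hypothetical layout: each shard owns a contiguous range of user IDs.
# Boundaries and names are illustrative assumptions, not from any real system.
SHARD_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]  # exclusive upper bounds
SHARD_NAMES = ["users_shard_0", "users_shard_1", "users_shard_2"]


def shard_for(user_id: int) -> str:
    """Return the name of the shard whose ID range contains user_id."""
    if user_id < 0 or user_id >= SHARD_UPPER_BOUNDS[-1]:
        raise ValueError(f"user_id {user_id} is outside every shard range")
    # bisect_right counts the bounds that are <= user_id,
    # which is exactly the index of the owning shard.
    return SHARD_NAMES[bisect_right(SHARD_UPPER_BOUNDS, user_id)]
```

The application calls `shard_for(user_id)` before each query and sends the query to whichever database that name maps to. The point of a partitioned view, as I understand the idea, is that the database does this routing for you instead.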