Hey guys,<p>I am a system administrator and was given the incredibly well-documented task of supporting "Big Data". In this particular case, the idea is to get an impression of user behaviour on a fairly large Swiss website by aggregating all log data and querying it later for analysis.<p>Our Java coders have mostly settled on "Let's do Cassandra!", while Riak also looks interesting. The requirements include supporting "schema changes" and running in a mission-critical environment (i.e. 0% downtime).<p>We already have MongoDB running, but MongoDB's sharding makes life hard when you are a sysadmin and have to think about backups.<p>So my question to you is:<p>Given the two alternatives, which one would you choose if you were the one running the infrastructure?
I have to agree with the other comments that you seem to have prematurely narrowed your options. The requirements that you mention are pretty broad and don't really pick out Riak or Cassandra in particular.<p>You may want to ask: what are your requirements for data consistency? Will you need to build reliable abstractions on top of your data store? If so, you may want to look at other options, such as FoundationDB.<p>Stephen Pimentel, foundationdb.com
Why not ask yourself "Why Cassandra?", "Why Riak?", and "Why am I looking at these two alternatives specifically?" for the task at hand?
That way you will know what each of them could do for your situation. Once you have established that, it will be easier to look at other factors.