Most people who don't have extensive experience with large-scale data stores miss a basic principle: redundancy decreases the probability of data loss, but it never eliminates it completely. All massive data stores slowly bleed data; they just bleed it so slowly that it's acceptable for most scenarios. In this specific example, once the number of users is large enough, there will always be somebody who lost their volume.<p>To illustrate: think of a GFS-style data store that randomly triplicates blobs across 1000 nodes. Once enough data is put in (let's say 100M blobs), virtually every 3-node combination will hold at least one blob whose only copies live on exactly those nodes. <i>In other words, the simultaneous loss of almost any 3 nodes out of 1000 will result in data loss</i>. ("Simultaneous" here means "faster than the time to detect the failure and re-replicate".) Of course, failures are not limited to node loss: there is also corruption in transit, drive failure, bad sectors, rack-level failures. As the volume of data and the number of nodes grow, it all adds up, so even if the mean time to data loss for any particular blob is astronomically high, the probability of losing some blob on any given day is very real.
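A back-of-the-envelope sketch of that combinatorial point, assuming each blob is assigned to a uniformly random set of 3 nodes (an assumption on my part, not something the paragraph above specifies; Python, standard library only). C(1000, 3) is about 166M distinct triplets, so the chance that a particular 3-node failure destroys at least one blob depends on how the blob count compares to that number:

    from math import comb

    nodes = 1000
    replicas = 3
    triplets = comb(nodes, replicas)  # ~1.66e8 possible 3-node placements

    for blobs in (100_000_000, 1_000_000_000):
        # Probability that some blob has all of its copies on one specific
        # failed triplet, given uniform random placement of each blob.
        p_loss = 1 - (1 - 1 / triplets) ** blobs
        print(f"{blobs:>13,} blobs -> P(a given 3-node failure loses data) ~ {p_loss:.3f}")

Under that assumption, at 100M blobs roughly 45% of possible triple failures already lose some blob, and by 1B blobs it is effectively all of them; that is the "enough data" regime described above.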