He had RAID and was doing filesystem-level backup, i.e. copying over the entire MySQL DB file. When filesystem-level corruption occurred, the backup script overwrote a good (perhaps 1-day-old) backup file with a corrupted file, so his backup was worthless.

The first thing that comes to mind is that he could have used application-level backup, i.e. dumping through MySQL itself. The dump would have noticed that the DB was corrupted, because reads (SELECTs) would have failed; the backup script could then have stopped and sent him an email telling him to restore from the last good backup instead of overwriting it.

If he had used a cloud service like Amazon SimpleDB, he wouldn't have had to worry about filesystem-level corruption, because that is abstracted away by Amazon. (And it's replicated.)

This is still not enough, though. What if the site gets hacked and the hacker issues DELETE statements? Then all your data is gone, and even if you have application-level backup, the dump will succeed (it will happily read the now-empty DB), thus overwriting your old backup.

I guess the conclusion is to keep around several copies of the data, and to have sanity checks in place to avoid overwriting good backups. In his case it was hard (given it's a homegrown application) to keep around many copies, because his DB was 500 GB in size.
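To make the idea concrete, here's a rough sketch of such a backup script, not his actual setup: it dumps through mysqldump (so corrupted tables tend to make it fail instead of silently copying garbage), refuses to touch old copies if the dump fails or looks suspiciously empty, and rotates timestamped copies rather than overwriting a single file. The database name, paths, retention count and size threshold are all made-up placeholders.

    # sketch only -- db name, paths and thresholds are hypothetical
    import subprocess, sys, datetime, pathlib

    BACKUP_DIR = pathlib.Path("/backups/mysql")   # hypothetical location
    KEEP = 7                                      # how many old copies to retain

    def main():
        BACKUP_DIR.mkdir(parents=True, exist_ok=True)
        stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
        target = BACKUP_DIR / f"db-{stamp}.sql.gz"

        # mysqldump reads every row through the server, so a corrupted table
        # usually makes it exit non-zero rather than produce a "good" dump.
        dump = subprocess.Popen(
            ["mysqldump", "--single-transaction", "mydb"],
            stdout=subprocess.PIPE,
        )
        with open(target, "wb") as out:
            gz = subprocess.Popen(["gzip"], stdin=dump.stdout, stdout=out)
            dump.stdout.close()
            gz.wait()
        if dump.wait() != 0 or gz.returncode != 0:
            target.unlink()                       # never keep a partial dump
            sys.exit("mysqldump failed -- old backups left untouched, investigate the DB")

        # Crude sanity check: a near-empty dump probably means the data is gone
        # (e.g. a hostile DELETE), so refuse to rotate anything out.
        if target.stat().st_size < 1024 * 1024:   # threshold is arbitrary
            sys.exit("dump suspiciously small -- keeping all old copies")

        # Rotate: delete only the oldest copies beyond KEEP, never the newest good ones.
        copies = sorted(BACKUP_DIR.glob("db-*.sql.gz"))
        for old in copies[:-KEEP]:
            old.unlink()

    if __name__ == "__main__":
        main()

At 500 GB per dump even a retention of 7 copies gets expensive, which is where incremental approaches (binlogs, snapshots) would come in, but the basic point stands: fail loudly, check the result, and never let the script be the thing that destroys the last good copy.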