TechEcho

5 comments

23david over 12 years ago

If you guys are writing post-mortem blog posts due to running out of disk space, your solution should really be to hire a sysadmin or ops-focused engineer. Disk-related issues are among the easiest to diagnose, so take this experience as a wake-up call that your current team is in over your heads. If you can't afford a sysadmin or don't want to bring that kind of talent in-house, you can try using a hosted solution. But make sure to really test out different hosted services before committing to one, since they can vary tremendously in terms of quality and reliability.

tedchs over 12 years ago

I have used Monit for years for basic server monitoring. It's a very tiny daemon with a single, simple config file. Basically I can tell it "when disk space exceeds X, or RAM exceeds Y, or CPU exceeds Z, or process identified by pidfile foo.pid isn't running, or I can't ping something, email me". No monitoring servers, no network polling, no SNMP, no monthly fees. Sounds like five lines of Monit config would have saved these guys. See the config file docs at http://mmonit.com/monit/documentation/monit.html.
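For anyone who wants to see the shape of such a threshold check without installing anything, here is a minimal Python sketch. It is not Monit and not its config syntax; the function names `disk_usage_percent` and `check_disk` and the 80% threshold are made up for illustration, and the print at the end is a placeholder for whatever alerting (email, pager) you actually use.

```python
import shutil


def disk_usage_percent(path="."):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def check_disk(path=".", threshold_percent=80.0):
    """Return an alert message if usage exceeds the threshold, else None.

    The 80% default is an arbitrary illustrative value, not a recommendation.
    """
    percent = disk_usage_percent(path)
    if percent > threshold_percent:
        return (
            f"ALERT: {path} is {percent:.1f}% full "
            f"(threshold {threshold_percent:.0f}%)"
        )
    return None


if __name__ == "__main__":
    message = check_disk("/", threshold_percent=80.0)
    if message:
        # Placeholder: replace with email or pager delivery.
        print(message)
```

Run it from cron every few minutes and you have a crude version of the disk check described above. A real daemon like Monit adds the parts that matter in practice, such as process checks and alert delivery.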

bsg75 over 12 years ago

This issue is oddly similar to issues seen at a prior gig, where MSSQL and MySQL transaction logs (replication bin logs for MySQL) consumed excess disk space when large operations did not fully replicate (for various reasons), and the log volume filled.

Monitoring helps, but unless your Ops staff knows what to do with a misbehaving database (RDBMS or other), it falls on the DBA or equivalent.

jtreminio over 12 years ago

I'm no server admin, but it seems to be a recurring theme where big issues are narrowed down to disk space running out. Is there not something that can automatically check this and send out alerts?

matthewowen over 12 years ago

Reading that site is like the way I imagine having cataracts must be.

Please, more contrast between text and background. It's like reading through a haze.

Lessons Learned from a Redis Outage at Yipit
