Heads up if you rely on NewRelic disk usage alerts

1 point by ende42, about 12 years ago
We just learned the hard way that the NewRelic disk usage alert will never be triggered if a user-level process fills up the disk until "no space left on device". That is, unless you change the defaults of either the alert threshold or your ext4 filesystem.

Issuing df on a "completely filled" disk will give you this:

    $ df
    Filesystem     1K-blocks    Used Available Use% Mounted on
    /dev/sda1        9611492 9123252         0 100% /

Notice that "Used" is still well below "1K-blocks" even though "Available" is 0? That gap is the space reserved for root processes, so that careless users can't crash their system just by filling up the disk. See stackexchange for further discussion (http://unix.stackexchange.com/questions/7950/reserved-space-for-root-on-a-filesystem-why)

While df does Used/(Used+Available) to calculate the disk usage percentage, NewRelic does Used/1K-blocks.

This means that when our disk got filled by a rogue process which wrote a huge logfile, the NewRelic disk usage measurement got stuck at 94.9%. As the NewRelic default threshold for disk usage alerts is 95%, the alert never got triggered, the alert email never got sent, and we had a service outage because the streaming server process crashed when it couldn't write to the disk.

End of story (to put it in NewRelic support staff words): "[…] you should really set to threshold under 95%, or tune your filesystem so that you have <5% reserved […]". For them the current state is intended behaviour. So heads up if you rely on NewRelic disk usage alerts!
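To make the arithmetic concrete, here is a minimal Python sketch of the two calculations described above, fed with the numbers from the df output. The function names are made up for illustration, and the "total-based" formula is only what the post says NewRelic does, not anything taken from NewRelic's code. Rounding the df-style figure up is an assumption about how GNU df displays Use%.

```python
import math
import os


def df_style_percent(used, available):
    """Use% the way df computes it: used / (used + available).

    Reserved-for-root blocks are in neither term, so a disk that is full
    for ordinary users shows 100%. Rounding up is an assumption about
    GNU df's display behaviour.
    """
    denom = used + available
    if denom == 0:
        return 0
    return math.ceil(100 * used / denom)


def total_based_percent(used, total):
    """Usage as used / total blocks (the formula the post attributes to
    NewRelic). Reserved blocks count as free space here."""
    return 100 * used / total


def usage_for_path(path="/"):
    """Hypothetical helper: compute both figures for a mounted filesystem."""
    st = os.statvfs(path)
    used = st.f_blocks - st.f_bfree   # blocks in use
    available = st.f_bavail           # blocks free for non-root users
    return df_style_percent(used, available), total_based_percent(used, st.f_blocks)


if __name__ == "__main__":
    # Values from the df output in the post (1K-blocks).
    total, used, available = 9611492, 9123252, 0
    print(df_style_percent(used, available))           # 100
    print(round(total_based_percent(used, total), 1))  # 94.9
```

With the default ext4 reservation of about 5%, the total-based figure tops out just below a 95% threshold once user processes can no longer write, which is why the alert stayed silent.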

No comments yet
