Predicting Disk Replacement Towards Reliable Data Centers [pdf]

43 points by zxv over 8 years ago

3 comments

zxv over 8 years ago
This paper [1] analyzes the Backblaze open source hard drive reliability data set [2].

Researchers at IBM Research Zurich analyze Backblaze's raw data. They used machine learning to formulate drive replacement rules with confidence intervals [1, 3].

In addition to the factors identified by Backblaze, they identify further SMART stats that enhance the predictive capability [1, Table 6].

For Hitachi drives they factor in the average spindle spin-up time (SMART raw 3). For Seagate drives they factor in the count of operations aborted due to HDD timeout (SMART raw 188).

[1] Botezatu et al., KDD 2016, Predicting Disk Replacement towards Reliable Data Centers, http://www.kdd.org/kdd2016/papers/files/adf0849-botezatuA.pdf

[2] What SMART Stats Tell Us About Hard Drives, https://www.backblaze.com/blog/what-smart-stats-indicate-hard-drive-failures/

[3] Predicting disk failures for reliable clouds, IBM Research Blog, https://www.ibm.com/blogs/research/2016/08/predicting-disk-failures-reliable-clouds/
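As a rough illustration of the two extra features mentioned above, here is a minimal sketch that pulls SMART raw 3 and SMART raw 188 out of rows shaped like the Backblaze daily CSV snapshots. The column names (`smart_3_raw`, `smart_188_raw`, `failure`) are assumed to match the published data set; the sample rows, model names, and the helper `extract_features` are made up for illustration. This only prepares features and is not the paper's model.

```python
import csv
import io

# Made-up sample shaped like a Backblaze daily snapshot (values are illustrative only).
SAMPLE_CSV = """date,serial_number,model,failure,smart_3_raw,smart_188_raw
2016-01-01,SN001,HITACHI-EXAMPLE,0,540,
2016-01-01,SN002,SEAGATE-EXAMPLE,0,0,3
2016-01-01,SN003,SEAGATE-EXAMPLE,1,0,17
"""

# Assumed CSV column -> readable feature name.
FEATURES = {
    "smart_3_raw": "spin_up_time",        # SMART raw 3
    "smart_188_raw": "command_timeouts",  # SMART raw 188
}


def extract_features(csv_text):
    """Return one dict per drive-day with the two SMART features and the failure label."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = {
            "serial_number": record["serial_number"],
            "model": record["model"],
            "failed": record["failure"] == "1",
        }
        for column, name in FEATURES.items():
            raw = record.get(column, "")
            # Drives that do not report an attribute leave the field empty.
            row[name] = int(raw) if raw else None
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in extract_features(SAMPLE_CSV):
        print(row)
```

Missing attributes are kept as `None` rather than zero, since a drive that does not report SMART 188 is different from one that reports no timeouts.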
hbogert over 8 years ago
I'm eager to know whether their model would still be accurate for new or future hard disk models, i.e., whether it would need to be retrained every time. If so, you would first have to wait until the new disks are old enough to show signs of wear and tear before the machine learning techniques could say anything meaningful. So in practice I'm afraid this approach gives you high-confidence predictions only for the last generation's hard disks. But companies like Backblaze probably buy the newest type of hardware every new "hardware season".
mkj over 8 years ago
So who's going to make us all a nice fork of smartmontools that does all the prediction?
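A first step toward that kind of tool might look like the sketch below, which reads the raw values of SMART attributes 3 and 188 from text in the attribute-table layout printed by `smartctl -A`. The sample output is made up, the helper `parse_smart_raw` is hypothetical, and it assumes the raw value sits in the tenth whitespace-separated column. It does no prediction; it only collects the inputs a model would need.

```python
# Made-up sample in the attribute-table layout printed by `smartctl -A`.
SAMPLE_OUTPUT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       8760
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       4
"""


def parse_smart_raw(smartctl_output, wanted_ids=(3, 188)):
    """Map attribute ID -> raw value for the requested SMART attributes."""
    values = {}
    for line in smartctl_output.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        attr_id = int(parts[0])
        if attr_id in wanted_ids:
            raw = parts[9]
            # Keep non-numeric raw values as strings instead of guessing.
            values[attr_id] = int(raw) if raw.isdigit() else raw
    return values


if __name__ == "__main__":
    print(parse_smart_raw(SAMPLE_OUTPUT))
```

In real use the text would come from running `smartctl -A` against a device; some drives print composite raw values, which this sketch leaves unparsed beyond the first token.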