This paper [1] analyzes Backblaze's open-source hard drive reliability data set [2].<p>In it, researchers at IBM Research Zurich work from Backblaze's raw data and use machine learning to formulate drive replacement rules with confidence intervals [1,3].<p>In addition to the factors identified by Backblaze, they identify further SMART stats that improve predictive capability [1, Table 6].<p>For Hitachi drives they factor in the average spindle spin-up time (SMART raw 3). For Seagate drives they factor in the count of operations aborted due to HDD timeout (SMART raw
188).<p>[1] Botezatu et al., Predicting Disk Replacement towards Reliable Data Centers, KDD 2016, <a href="http://www.kdd.org/kdd2016/papers/files/adf0849-botezatuA.pdf" rel="nofollow">http://www.kdd.org/kdd2016/papers/files/adf0849-botezatuA.pd...</a><p>[2] What SMART Stats Tell Us About Hard Drives, Backblaze Blog, <a href="https://www.backblaze.com/blog/what-smart-stats-indicate-hard-drive-failures/" rel="nofollow">https://www.backblaze.com/blog/what-smart-stats-indicate-har...</a><p>[3] Predicting disk failures for reliable clouds, IBM Research Blog, <a href="https://www.ibm.com/blogs/research/2016/08/predicting-disk-failures-reliable-clouds/" rel="nofollow">https://www.ibm.com/blogs/research/2016/08/predicting-disk-f...</a>
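<p>For anyone who wants to poke at the two vendor-specific attributes mentioned above, here is a minimal sketch of pulling them out of daily drive snapshots. It is not the paper's method, only a feature lookup. The column names assume the smart_<id>_raw pattern of the Backblaze CSV files, the sample rows are made up, and vendor_of is a hypothetical, crude guess based on the model string.

```python
import csv
import io

# Vendor -> extra SMART raw attribute, as described in the comment above.
# Column names assume the smart_<id>_raw pattern (an assumption, check your files).
EXTRA_FEATURE = {
    "hitachi": "smart_3_raw",    # spindle spin-up time
    "seagate": "smart_188_raw",  # operations aborted due to timeout
}


def vendor_of(model):
    """Hypothetical helper: guess the vendor from the model string."""
    m = model.strip().lower()
    if m.startswith("st"):
        return "seagate"
    if m.startswith("hitachi") or m.startswith("hgst"):
        return "hitachi"
    return None


def extra_feature(row):
    """Return (column, raw value) for the vendor-specific attribute, or None."""
    column = EXTRA_FEATURE.get(vendor_of(row["model"]))
    if column is None:
        return None
    value = row.get(column)
    if not value:  # missing column or empty cell
        return None
    return (column, int(value))


# Made-up sample rows, for illustration only.
SAMPLE = """date,serial_number,model,failure,smart_3_raw,smart_188_raw
2016-01-01,A1,ST4000DM000,0,0,2
2016-01-01,B2,Hitachi HDS722020ALA330,0,627,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
features = [extra_feature(r) for r in rows]
print(features)
```

A real pipeline would feed these values, together with the attributes Backblaze already tracks, into a classifier. The lookup here only shows where the per-vendor signal would come from.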