HPE Drive fail at 32,768 hours without firmware update

229 points by abarringer over 5 years ago

16 comments

jzwinck over 5 years ago
Those who forget history are doomed to repeat it. Just seven years ago Crucial sold tens of thousands of their "M4" SSDs with a firmware bug that made them fail after 5184 hours: https://www.anandtech.com/show/5424/crucial-provides-a-firmware-update-for-m4-to-fix-the-bsod-issue

Do they still not test these things with artificially incremented counters?
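A minimal sketch of what testing with an artificially incremented counter could look like. `ToyDrive` is a hypothetical stand-in, not any vendor's firmware, and the "fails when the stored counter goes negative" rule is an assumption chosen to mimic a signed 16-bit overflow. The idea is to preload the counter just below each power-of-two boundary instead of waiting years for it to get there.

```python
import ctypes


class ToyDrive:
    """Hypothetical drive model; not based on any real firmware."""

    def __init__(self, power_on_hours=0):
        self.power_on_hours = power_on_hours

    def tick(self):
        # Advance the simulated power-on time by one hour.
        self.power_on_hours += 1

    def stored_counter(self):
        # Assumed bug: the hour count is kept in a signed 16-bit field.
        return ctypes.c_int16(self.power_on_hours).value

    def is_healthy(self):
        # Assumed failure mode: a negative counter is treated as fatal.
        return self.stored_counter() >= 0


def first_failing_hour(boundaries, margin=2):
    """Start just below each boundary, step across it, report the first failure."""
    for boundary in boundaries:
        drive = ToyDrive(power_on_hours=boundary - margin)
        for _ in range(2 * margin):
            drive.tick()
            if not drive.is_healthy():
                return drive.power_on_hours
    return None


print(first_failing_hour([2 ** 8, 2 ** 15, 2 ** 16]))
```

A real test bench would need some way to set the counter on actual hardware or in a firmware simulator; whether a given drive exposes that is not something this thread establishes.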
userbinator over 5 years ago
According to this page, the SMART hour counter is only 16 bits, and rollover should be harmless:

http://www.stbsuite.com/support/virtual-training-center/power-on-hours-rollover

If you look elsewhere on the Internet, you'll find people with very old and working HDDs that have rolled over, so I suspect this bug is limited to a small number of drives.

(What that page says about not being able to reset it is... not true.)

Likewise, I'm skeptical of "neither the SSD nor the data can be recovered"; they just want you to buy a new one.

Tangentially related, I wonder how many modern cars will stop working once the odometer rolls over.
abarringer over 5 years ago
Since most drives in an array are powered on and used together, this bug would take out the whole RAID set at once. There's a dark day coming for some sysadmins.
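To make the RAID concern concrete, here is a rough sketch that estimates how many drives would hit the 32,768-hour mark within the same short window. It assumes the drives are powered continuously (so power-on hours track wall-clock time), and the in-service dates below are invented for illustration.

```python
from datetime import datetime, timedelta

FAIL_AFTER = timedelta(hours=32768)  # figure from the HPE bulletin


def failure_time(in_service):
    # Assumption: the drive is never powered off after entering service.
    return in_service + FAIL_AFTER


def max_simultaneous_failures(in_service_times, window):
    """Largest number of drives whose failure times fall within one window."""
    times = sorted(failure_time(t) for t in in_service_times)
    best = 0
    start = 0
    for end in range(len(times)):
        # Shrink the window from the left until it spans at most `window`.
        while times[end] - times[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


# Hypothetical array: four drives installed minutes apart, one replaced later.
array = [datetime(2016, 1, 4, 9, 0) + timedelta(minutes=5 * i) for i in range(4)]
array.append(datetime(2016, 7, 1, 12, 0))

print(max_simultaneous_failures(array, window=timedelta(days=1)))
```

With these made-up dates, four drives fail within a day of each other, which is more than a two-parity layout such as RAID 6 can absorb.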
verytrivial over 5 years ago
> HPE was notified by a Solid State Drive (SSD) manufacturer [...]

That's a curious bit of context. It seems to imply they're shifting some of the blame onto their manufacturer? It makes me wonder if this firmware is 100% HPE specific, or if there is a 2^15 hours bug about to bite a bunch of other pipelines.
voiper1 over 5 years ago
> The issue affects SSDs with an HPE firmware version prior to HPD8 that results in SSD failure at 32,768 hours of operation (i.e., 3 years, 270 days 8 hours). After the SSD failure occurs, neither the SSD nor the data can be recovered. In addition, SSDs which were put into service at the same time will likely fail nearly simultaneously.

Looks like some sort of run time stored in a signed 2 byte integer. Oops.
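A quick check of the arithmetic in the bulletin, plus what a signed 2-byte integer does at that value. This is only a sketch of the suspected mechanism: the thread does not confirm how the firmware actually stores the counter, and the 365-day year is an assumption that happens to reproduce the bulletin's figure.

```python
import ctypes

HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: leap days ignored


def split_hours(total_hours):
    """Break an hour count into (years, days, hours)."""
    days, hours = divmod(total_hours, HOURS_PER_DAY)
    years, days = divmod(days, DAYS_PER_YEAR)
    return years, days, hours


def as_int16(value):
    """Reinterpret a Python int as a signed 16-bit integer, wrapping on overflow."""
    return ctypes.c_int16(value).value


limit = 2 ** 15  # 32,768
print(split_hours(limit))                    # (3, 270, 8)
print(as_int16(limit - 1), as_int16(limit))  # 32767 -32768
```

The last line shows the jump from the largest positive value to the most negative one, which is the kind of discontinuity that code expecting a steadily increasing counter tends not to handle.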
pabs3 over 5 years ago
Would be nice if the standard firmware update mechanism on Linux (fwupd/LVFS) could be used for HPE products.

https://fwupd.org/lvfs/vendors/
https://fwupd.org/lvfs/devices/
zozbot234 over 5 years ago
Ouch. I wonder how many non-enterprise SSDs come with similar bugs, *and* zero support by the firmware vendor.
_bxg1 over 5 years ago
Whatever the counter is, the fact that it's 32,768 instead of 65,536 suggests they used a *signed int* for something that presumably starts at zero and increases monotonically... Avoiding just that mistake would've given them twice as much time - nearly 7.5 years - which seems like it'd be longer than these drives would typically last anyway.
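The "twice as much time" estimate can be checked with a few lines. This assumes a 16-bit counter that ticks once per hour from zero and an average year of 365.25 days; the helper name `hours_until_wrap` is made up for this example.

```python
HOURS_PER_YEAR = 24 * 365.25  # assumption: average year length with leap days


def hours_until_wrap(bits, signed):
    """Hourly ticks from zero until the counter can no longer hold the value."""
    usable_bits = bits - 1 if signed else bits
    return 2 ** usable_bits


for signed in (True, False):
    hours = hours_until_wrap(16, signed)
    kind = "signed" if signed else "unsigned"
    print(f"{kind:>8} 16-bit: {hours} hours, about {hours / HOURS_PER_YEAR:.1f} years")
```

Under those assumptions the signed counter runs out after roughly 3.7 years and the unsigned one after roughly 7.5, matching the figures in the comment.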
pjc50 over 5 years ago
Amazing. A repeat of the "Windows 95 crashes after 48 days uptime" and other timer rollover bugs.
S_A_P over 5 years ago
I just want to know how many of these failed at 32768 hours before they had their oh sh*t moment.
gruez over 5 years ago
> By disregarding this notification and not performing the recommended resolution, the customer accepts the risk of incurring future related errors.

How does this work legally? For one, how would HPE prove that the customer read the bulletin? I don't imagine they're sending these out via certified mail.
bobowzki over 5 years ago
At the hospital where I work, almost all HP desktops crashed within a few months...
iveqy over 5 years ago
Probably related to https://news.ycombinator.com/item?id=21471997
EvanAnderson over 5 years ago
I did some recon on eBay looking for used units w/ the affected SKUs for sale and they appear to be Samsung units.
annoyingnoob over 5 years ago
Whew, dodged that bullet, looks like I'm not using any of the affected drives. Lucky me, for now.
paggle over 5 years ago
Yikes! This is why when I built my home NAS I used five different drives and manufacturers.