Those who forget history are doomed to repeat it. Just seven years ago Crucial sold tens of thousands of their "M4" SSDs with a firmware bug that made them fail after 5184 hours: <a href="https://www.anandtech.com/show/5424/crucial-provides-a-firmware-update-for-m4-to-fix-the-bsod-issue" rel="nofollow">https://www.anandtech.com/show/5424/crucial-provides-a-firmw...</a><p>Do they still not test these things with artificially incremented counters?
According to this page, the SMART hour counter is only 16 bits, and rollover should be harmless:<p><a href="http://www.stbsuite.com/support/virtual-training-center/power-on-hours-rollover" rel="nofollow">http://www.stbsuite.com/support/virtual-training-center/powe...</a><p>If you look elsewhere on the Internet, you'll find people with very old and working HDDs that have rolled over, so I suspect this bug is limited to a small number of drives.<p>(What that page says about not being able to reset it is... not true.)<p>Likewise, I'm skeptical of "neither the SSD nor the data can be recovered" --- they just want you to buy a new one.<p>Tangentially related, I wonder how many modern cars will stop working once the odometer rolls over.
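To illustrate why rollover of an <i>unsigned</i> 16-bit counter is harmless: it just wraps back to zero, which the host can shrug off. A minimal sketch (assuming the firmware masks the counter to 16 bits; the function name is mine, not from any SMART spec):

```python
MASK = 0xFFFF  # 16-bit unsigned counter

def tick(hours: int) -> int:
    """Advance an unsigned 16-bit power-on-hours counter, wrapping at 65536."""
    return (hours + 1) & MASK

h = 65535      # counter at its maximum
h = tick(h)
print(h)       # 0 -- the drive keeps working; SMART simply restarts counting
```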
> HPE was notified by a Solid State Drive (SSD) manufacturer [...]<p>That's a curious bit of context. It seems to imply they're shifting some of the blame onto their manufacturer. It makes me wonder if this firmware is 100% HPE-specific, or if there's a 2^15-hours bug about to bite a bunch of other product lines.
>The issue affects SSDs with an HPE firmware version prior to HPD8 that results in SSD failure at 32,768 hours of operation (i.e., 3 years, 270 days 8 hours). After the SSD failure occurs, neither the SSD nor the data can be recovered. In addition, SSDs which were put into service at the same time will likely fail nearly simultaneously.<p>Looks like some sort of run time stored in a signed 2-byte integer. Oops.
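Speculation on the failure mode, not from the bulletin: if the hour count is reinterpreted through a signed 16-bit integer, hour 32,768 suddenly reads as a negative run time, which downstream firmware logic may not handle. A quick way to see the reinterpretation:

```python
import ctypes

def as_int16(value: int) -> int:
    """Reinterpret a value as a signed 16-bit (two's complement) integer."""
    return ctypes.c_int16(value).value

print(as_int16(32767))   # 32767  -- the last "good" hour
print(as_int16(32768))   # -32768 -- a negative run time at the next hour
```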
Would be nice if the standard firmware update mechanism on Linux (fwupd/LVFS) could be used for HPE products.<p><a href="https://fwupd.org/lvfs/vendors/" rel="nofollow">https://fwupd.org/lvfs/vendors/</a>
<a href="https://fwupd.org/lvfs/devices/" rel="nofollow">https://fwupd.org/lvfs/devices/</a>
Whatever the counter is, the fact that it's 32,768 instead of 65,536 suggests they used a <i>signed</i> 16-bit integer for something that presumably starts at zero and increases monotonically... Avoiding just that mistake would've given them twice as much time - nearly 7.5 years - which seems like it'd be longer than these drives would typically last anyway.
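The arithmetic behind "twice as much time" checks out:

```python
HOURS_PER_YEAR = 24 * 365          # 8760

signed_limit = 2 ** 15             # 32768 hours: where a signed 16-bit counter wraps
unsigned_limit = 2 ** 16           # 65536 hours: where an unsigned one would wrap

print(signed_limit / HOURS_PER_YEAR)    # ~3.7 years
print(unsigned_limit / HOURS_PER_YEAR)  # ~7.5 years
```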
>By disregarding this notification and not performing the recommended resolution, the customer accepts the risk of incurring future related errors.<p>How does this work legally? For one, how would HPE prove that the customer read the bulletin? I don't imagine they're sending these out via certified mail.
Probably related to <a href="https://news.ycombinator.com/item?id=21471997" rel="nofollow">https://news.ycombinator.com/item?id=21471997</a>