An Empirical Analysis of Hardware Failures on a Million Consumer PCs

150 points by mbafk, almost 13 years ago

13 comments

cs702, almost 13 years ago
Very useful -- I will take this analysis into account when it's time to upgrade my current personal machine or configure the next one! Thank you for posting this here.

The only thing I would have wanted to see but didn't in this analysis is how failure rates vary for different types of disk subsystems -- specifically, traditional hard drives versus the newer solid-state devices. I suspect, but don't know for sure, that the latter have much, much lower real-world failure rates in the first 30 days of total accumulated CPU time (TACT).

The authors openly suggest that the sharp difference in failure rates between desktop and laptop machines may be due in part to their disk subsystems: "Laptops are between 25% and 60% less likely than desktop machines to crash from a hardware fault over the first 30 days of observed TACT. We hypothesize that the durability features built into laptops (such as motion-robust hard drives) make these machines more robust to failures in general." Alas, the authors don't delve any further into it.

I'd like to see hard data comparing the real-world failure rates of *both* desktops and laptops using traditional versus solid-state disk subsystems.
mrb, almost 13 years ago
When Microsoft, Google, or some university publishes an analysis of hardware failures across large numbers of machines, they always anonymize hardware vendors ("vendor A", "vendor B").

I understand the reasons (not alienating your hardware vendors), but will there ever be a research group that will disclose vendor names? Heck, I would *pay* for this information.
wazoox, almost 13 years ago
Among other interesting insights:

* a machine that crashed once is 100 times more likely to crash again; the more it crashes, the more it's prone to fail again.
* overclocking significantly reduces reliability. One CPU vendor (AMD or Intel, but unspecified) is much worse in this regard, too.
* conversely, underclocking improves reliability.
* branded computers are more reliable than beige boxes.
* laptops are more reliable than desktops.
ChrisNorstrom, almost 13 years ago
I'm having a hard time coming to terms with "Laptops less likely to crash from hardware fault than desktops".

Everything we've learned from experience, surveys, and PC World magazines has shown the opposite. Heat kills hardware, and laptops have their hardware packed together so closely that it generates lots of heat. Back then I remember reading something like 1 in 4 laptops fail in the first 3 years. Which was very believable; at the time I was in college for game design & development. All 80 guys in our class had laptops from HP (with, get this... Pentium 4s in them). Those laptops had a LOT of problems. They were basically portable heaters.

So I guess laptops now have either much better cooling, much cooler CPUs, or a combination. OR PCs are just terribly cooled.
josephturnip, almost 13 years ago
Interesting stuff. You can improve reliability by running your system at a lower speed. Here's a blog post with a summary of some of the conclusions of the paper above: http://grano.la/blog/2012/06/improve-the-reliability-of-your-pc/ (Disclaimer: that's my company's blog)

One question I still have is whether the switching of CPU frequencies has any effect, or if it is only the average speed that correlates with reliability. Anecdotal evidence suggests that this is the case, but it could be an area for further research.
kristaps, almost 13 years ago
Interesting. Too bad the power supplies could not be controlled in their setup, as a wonky power supply can unleash all kinds of gremlins that look like failures in components down the line.
Zenst, almost 13 years ago
Interesting read, though why can't Microsoft just tell me that my CPU or HD or memory is borking and suggest I RMA it, instead of asking every time "have you applied the latest updates?", which I get to click "unhelpful" on.

The most important thing in a PC for reliability, I have found, is a good PSU above everything else. It really does make a difference on the hardware side, as you give your kit cleaner power. Add a UPS/surge protector and you can double the lifetime of your kit. At least in my experience, it has been noticeable.
Hoff, almost 13 years ago
The copy at Microsoft Research is offline.

Here's another copy of the paper:

http://eurosys2011.cs.uni-salzburg.at/pdf/eurosys2011-nightingale.pdf
acqq, almost 13 years ago
There are a lot of insights in the paper, but I'd really like to know about this:

"The table shows that CPUs from Vendor A are nearly 20x as likely to crash a machine during the 8 month observation period when they are overclocked, and CPUs from Vendor B are over 4x as likely"

Obviously there's a 5x difference between Intel and AMD in the probability of having an unstable system when overclocked, but they don't say which one is better. Does anybody know?
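A note on the arithmetic in the comment above: the "5x" figure is the ratio of two relative-risk multipliers (roughly 20x and 4x), which is not necessarily the ratio of absolute crash probabilities. The sketch below illustrates this under loudly stated assumptions: the baseline crash rates are invented for illustration, since neither the quote nor the thread gives them, and the multipliers are rounded from "nearly 20x" and "over 4x".

```python
import math

# Multipliers quoted in the comment (rounded): how much more likely a machine
# is to crash when overclocked, relative to the same vendor's stock CPUs.
RELATIVE_RISK = {"Vendor A": 20.0, "Vendor B": 4.0}

# HYPOTHETICAL baseline crash probabilities for non-overclocked machines.
# These numbers are invented for illustration; they are not from the paper.
HYPOTHETICAL_BASELINE = {"Vendor A": 0.002, "Vendor B": 0.010}


def overclocked_probability(baseline: float, relative_risk: float) -> float:
    """Scale a baseline probability by a relative risk, capped at 1.0."""
    return min(1.0, baseline * relative_risk)


# The "5x" in the comment is the ratio of the two multipliers.
ratio_of_multipliers = RELATIVE_RISK["Vendor A"] / RELATIVE_RISK["Vendor B"]

# Absolute probabilities also depend on each vendor's own baseline.
overclocked = {
    vendor: overclocked_probability(HYPOTHETICAL_BASELINE[vendor], risk)
    for vendor, risk in RELATIVE_RISK.items()
}

print(f"ratio of multipliers: {ratio_of_multipliers:.1f}")
for vendor, probability in overclocked.items():
    print(f"{vendor} overclocked crash probability: {probability:.3f}")
```

With these made-up baselines, both vendors end up with the same absolute crash probability when overclocked, so the multipliers alone may not say which vendor is "better" in absolute terms.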
hollerith, almost 13 years ago
The result most surprising to me is that laptops are between 25% and 60% less likely than desktop machines to crash from a hardware fault during the first 30 days' worth of measurements.

The much larger weight and volume of desktops would seem to make them easier to cool.
hollerith, almost 13 years ago
Too bad CPU temperature was not part of the collection of data used in the study.
latch, almost 13 years ago
Is there a compelling reason for this to be a PDF rather than HTML? I'm genuinely curious.
stcredzero, almost 13 years ago
I've always said that smart hardware tinkerers *underclock*. It produces less heat, and results in a quieter machine. I always suspected it improves reliability.