Serious Intel CPU bugs (2016)

562 points by derek over 7 years ago

15 comments

slivym over 7 years ago
As a former Intel employee, this aligns closely with my experience. I didn't work in validation (I actually joined as part of Altera), but velocity is an absolute buzzword and senior management's approach to complex challenges is sheer panic. Slips in schedules are not tolerated at all, so problems in validation are an existential threat: your project can easily just be canned. Also, because of the size of the company, the ways in which quality and completeness are 'achieved' are hugely bureaucratic and rarely reflect true engineering fundamentals. Intel's biggest challenge is simply that it's not 'winning big' at the moment, and rather than showing strong leadership and focus, the company just jumps from fad to fad, failing at each (VR is dead, long live automotive).
bwang29 over 7 years ago
Quote from the article: "We need to move faster. Validation at Intel is taking much longer than it does for our competition. We need to do whatever we can to reduce those times… we can’t live forever in the shadow of the early 90’s FDIV bug, we need to move on. Our competition is moving much faster than we are".

Competitive pressure can make a company's new product worse than (in this case, less stable than) its previous products, e.g. the Samsung phone explosions. As I remember the story, Samsung wanted to release its phone ahead of the iPhone, and I would imagine the testing went through a similarly stressful time as at Intel.

Of course, not every case of taking such risks leads to disaster. Just imagine Intel rushing to release new chips ahead of the competition and, 99 times out of 100, ending up with a chip that performs well. But a unique characteristic of Intel's case is that these bugs, unlike a faulty battery design, are cumulative and carry over into future product development, which means a few small wins in catching up with your competitor could also lead to massive failures in some later major battle.

Now imagine Intel's competitors are going through the exact same scenario. One possible outcome is that both Intel's and its competitors' products become less stable and more buggy over time, and until everyone's stuff seems to be broken, they will probably never have time to fix it.
paulmd over 7 years ago
Denverton is much more complex than a "simple" Atom (performance of a C3958 is up to about half of an i5-7500 in single-thread, twice the total multi-thread performance). Avoton is really no slouch either. It's really not surprising that the incidence of bugs is increasing on those uarchs as the complexity grows.

The Skylake/Kaby hyperthread bug has been fixed in microcode and is no longer applicable. It's perfectly safe to run HT on these processors now.

The AMD Ryzen segfault remains unmitigated at this point in time. Phoronix rushed to declare the bug fixed because they got a binned RMA replacement, but there are plenty of reports of it occurring in current-production processors to at least a moderate degree, roughly proportionate with ASIC/litho quality. It's unclear what the scope is w/r/t Epyc, since Epyc is on a different stepping but also hasn't really ramped yet. The early Epyc processors were essentially engineering samples (on the order of hundreds to single-digit thousands of samples) with no real (public) visibility into any binning that might be taking place.

The Ryzen high-address bug is no big deal; that's the kind of thing that gets patched all the time (like the Skylake HT bug). That's one thing Dan is glossing over here: there are tons of these bugs all the time, and as long as there is an effective mitigation available it's no big deal.

The PTI patch can be viewed as making syscalls take somewhat longer (about double iirc). Gamers and compute-oriented workloads will hardly be hurt at all. The average mixed-workload case sees a 5% performance loss, which is not ideal but not critical either. Losing 30% is real bad though, and that's what you will get on IO-heavy workloads that context-switch into the kernel a lot.

The only real mitigation right now appears to be giving up hyperconvergence for the time being and hardening those DB/NAS servers that are going to be pushing a lot of IO, so that you know there won't be hostile code running on them. That will allow you to safely disable PTI and sidestep the performance hit.

Of course, Epyc was not that good at running databases in the first place, so you still might be better off sucking it up and running Intel even with the PTI patch. It will probably depend on your actual workload and the relative amount of IO vs processing.
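A back-of-the-envelope sketch of how the PTI figures in the comment above relate to each other. It assumes, as a simplification, that a workload spends a fixed fraction of its pre-patch runtime on syscall entry/exit overhead and that the patch multiplies only that part by a constant factor (2.0 here, taken from the commenter's "about double iirc"). The function names and the model are illustrative assumptions, not measurements of any real system.

```python
def slowdown(syscall_fraction: float, cost_factor: float = 2.0) -> float:
    """Relative runtime after the patch (1.0 means unchanged).

    syscall_fraction: assumed share of pre-patch runtime spent on
    syscall entry/exit overhead, between 0.0 and 1.0.
    cost_factor: assumed multiplier applied to that share by the patch.
    """
    if not 0.0 <= syscall_fraction <= 1.0:
        raise ValueError("syscall_fraction must be between 0 and 1")
    if cost_factor < 1.0:
        raise ValueError("cost_factor must be at least 1")
    # The untouched part keeps its cost; the syscall part is scaled.
    return (1.0 - syscall_fraction) + syscall_fraction * cost_factor


def throughput_loss(syscall_fraction: float, cost_factor: float = 2.0) -> float:
    """Fraction of throughput lost, e.g. 0.05 for a 5% loss."""
    return 1.0 - 1.0 / slowdown(syscall_fraction, cost_factor)


def fraction_for_loss(loss: float, cost_factor: float = 2.0) -> float:
    """Invert the model: which syscall share would produce a given loss?"""
    if not 0.0 <= loss < 1.0:
        raise ValueError("loss must be in [0, 1)")
    if cost_factor <= 1.0:
        raise ValueError("cost_factor must be greater than 1")
    return (1.0 / (1.0 - loss) - 1.0) / (cost_factor - 1.0)


if __name__ == "__main__":
    # The two loss figures quoted in the comment.
    for loss in (0.05, 0.30):
        share = fraction_for_loss(loss)
        print(f"{loss:.0%} loss <-> {share:.1%} of runtime in syscall overhead")
```

Under this model a 5% loss corresponds to roughly 5% of runtime spent in syscall overhead, and a 30% loss to roughly 43%, which is consistent with the comment's point that only workloads that enter the kernel very often are hit hard.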
dataflow over 7 years ago
Out of curiosity, if you notice a CPU bug in a computer under warranty, is there anything the vendor is usually obligated to do, or are they under no obligation to do anything about a CPU bug? Is that considered a defect they have to handle?

(Edit: I'm assuming the USA, and I'm assuming bugs that were not known to the vendor at the time of the sale.)
chrisper over 7 years ago
So I recently bought an 8700k. I was wondering if I should return it and get AMD instead? Not sure how much the recent bugs will impact me performance-wise.
pier25 over 7 years ago
It annoys me when online content (or an edit to it) doesn't have an explicitly stated date.
lukax over 7 years ago
Is this article from 2015 or 2017?

It would be great if the page displayed the date that the article was posted/updated. It is not in the URL nor the sources. The only way to see the dates is in the RSS feed and even that is only for new articles.
mtgx over 7 years ago
> As someone who worked in an Intel Validation group for SOCs until mid-2014 or so I can tell you, yes, you will see more CPU bugs from Intel than you have in the past from the post-FDIV-bug era until recently.

> Why?

> Let me set the scene: It’s late in 2013. Intel is frantic about losing the mobile CPU wars to ARM. Meetings with all the validation groups. Head honcho in charge of Validation says something to the effect of: “We need to move faster. Validation at Intel is taking much longer than it does for our competition. We need to do whatever we can to reduce those times… we can’t live forever in the shadow of the early 90’s FDIV bug, we need to move on. Our competition is moving much faster than we are” - I’m paraphrasing.

> Many of the engineers in the room could remember the FDIV bug and the ensuing problems caused for Intel 20 years prior. Many of us were aghast that someone highly placed would suggest we needed to cut corners in validation - that wasn’t explicitly said, of course, but that was the implicit message. That meeting there in late 2013 signaled a sea change at Intel to many of us who were there. And it didn’t seem like it was going to be a good kind of sea change. Some of us chose to get out while the getting was good. As someone who worked in an Intel Validation group for SOCs until mid-2014 or so I can tell you, yes, you will see more CPU bugs from Intel than you have in the past from the post-FDIV-bug era until recently.

So this is why Krzanich sold his stock. He knows the bug is his fault. Whoops. I think someone may "quit for personal reasons" soon.

https://www.fool.com/investing/2017/12/19/intels-ceo-just-sold-a-lot-of-stock.aspx
_0w8t over 7 years ago
I can only expect that the future will be worse. It may be that VM providers will find it unprofitable to offer a VM capable of running generic native code. Another thing is that security products for desktops, like Qubes OS, that rely on hardware isolation to run untrusted code may need to reconsider their business model.
johnflan over 7 years ago
I have seen a lot of talk of AMD benefitting from this but what about ARM - how are their server offerings shaping up?
drej over 7 years ago
Can we put [2016] in the title? Thanks!
hungerstrike over 7 years ago
All the CPUs in my house are 4th and 5th generation Intel CPUs except for one PC laptop that has a Skylake processor.

I guess I'm glad now that Apple put a 2-year-old CPU in the early 2015 MacBook Pro! Besides my 2012 Mac Pro, that is the most expensive machine in the house!
scribu over 7 years ago
I wonder if this is related to the recently discovered design flaw in Intel CPUs:

https://news.ycombinator.com/item?id=16055395
juanmirocks over 7 years ago
Considering Intel's recent problems, Apple is going to be even more tempted to design its own CPUs/GPUs for the Mac.

What do you think, is this realistic?
prewett over 7 years ago
Before we all jump on Intel for being buggy, what does the list of serious AMD bugs look like for the past five years? If AMD has a similar number of bugs, we should jump on them, too. If not, then we will actually know that Intel deserves being jumped on to the exclusion of AMD.