Attack of the cosmic rays: Undetected memory errors can happen to you

118 points by nelhage, almost 15 years ago

12 comments

mattmanser, almost 15 years ago
"Since that incident, I've had several other, similar problems. Something would start failing mysteriously, but flushing my cache restored it to normal."

This seems like a bit of a red flag that, in reality, something else is going wrong with his computer.
yellowbkpk, almost 15 years ago
To give you an idea of the density/frequency of this occurring: my wife's CCD for her PhD experiments routinely (roughly 1 in 5 exposures) picks up huge spikes from cosmic rays during her 30-second exposures. The CCD is less than an inch square and she's 2 floors down from ground level.
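As a rough back-of-the-envelope sketch of what those numbers imply: this assumes hits arrive as a Poisson process, that "1 in 5" means one exposure in five shows at least one hit, and that one square inch is an upper bound on the sensor area. The function name is made up for illustration.

```python
import math


def hits_per_second(fraction_hit: float, exposure_s: float) -> float:
    """Poisson rate implied by the fraction of exposures with at least one hit."""
    # P(at least one hit in t seconds) = 1 - exp(-rate * t)
    # => rate = -ln(1 - P) / t
    return -math.log(1.0 - fraction_hit) / exposure_s


# Numbers from the comment: roughly 1 in 5 exposures, 30 seconds each.
rate = hits_per_second(0.2, 30.0)
seconds_between_hits = 1.0 / rate

# "Less than an inch square": use 1 in^2 as an upper bound on the area,
# which makes the per-area figure a lower bound (assumption).
area_cm2 = 2.54 ** 2
min_hits_per_cm2_per_s = rate / area_cm2

print(f"about one visible hit every {seconds_between_hits:.0f} s")
print(f"at least {min_hits_per_cm2_per_s:.5f} visible hits per cm^2 per second")
```

Under those assumptions this works out to about one visible hit every 134 seconds on the sensor. It only counts events large enough to show up as spikes, so it says nothing directly about how often a DRAM cell would flip.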
tetha, almost 15 years ago
Reading this, I remember how hard NASA works to harden their satellites and probes against cosmic rays, because out there in space, cosmic rays make your memory pretty unpredictable. Error-correcting codes and redundancy suddenly become really important, even though you are crammed into a little embedded system that has less processing power than some input devices these days.
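To make "error-correcting codes" concrete, here is a minimal sketch of a Hamming(7,4) code, which stores 4 data bits in 7 bits and can correct any single flipped bit. It is a textbook illustration only, not what any particular spacecraft or memory controller uses. ECC DIMMs typically apply a wider SECDED code over 64-bit words, and the function names here are invented for the example.

```python
from typing import List, Tuple


def hamming74_encode(data: List[int]) -> List[int]:
    """Encode 4 data bits as a 7-bit codeword laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword: List[int]) -> Tuple[List[int], int]:
    """Return (data bits, syndrome); a nonzero syndrome is the 1-based position that was corrected."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the bit the syndrome points at
    return [c[2], c[4], c[5], c[6]], syndrome


# Simulate a single "cosmic ray" bit flip and recover the original data.
word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1  # corrupt codeword position 5
recovered, position = hamming74_decode(stored)
print(recovered == word, position)
```

The cost is three extra bits for every four data bits. Plain Hamming(7,4) corrects one flipped bit per codeword; two flips in the same codeword would be miscorrected, which is why practical memory ECC adds an extra parity bit to at least detect double errors.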
rubyrescue, almost 15 years ago
Inspiring for the seeming ease with which he moves between package managers and debuggers...
thingie, almost 15 years ago
I'm not saying that cosmic rays cannot happen (they absolutely do; I mean whether they can cause memory corruption that actually makes some difference in the running system), but this is quite strange. No such faults were happening before this single incident, and now there are many similar faults happening regularly? Why should I suspect cosmic rays (was there any reason for such a sudden change in their activity and its visibility?) and not a hardware fault?
ajb, almost 15 years ago
These kinds of memory errors are more often caused by alpha particles emitted by radioactive elements in the chip package: http://en.wikipedia.org/wiki/Soft_error
gacba, almost 15 years ago
For those who want to know more about cosmic rays, Wikipedia is filled with goodness on the subject (http://en.wikipedia.org/wiki/Cosmic_ray). I was looking for stats on average density per m² to determine just how prevalent this effect might be in ground-based electronics. It's been a major problem with high-altitude and satellite equipment for a long, long time.
seanlinmt, almost 15 years ago
From my experience, I think it is unlikely to be due to cosmic rays. The most likely culprit is the power supply or data buffers. Those non-tantalum capacitors tend to reach end of life faster if you're operating in high-humidity conditions.

This reminded me of a number of random crashes that a client of my previous company had. Stack dumps just showed random errors. We had about a year's worth of crash logs from a couple thousand network switches (they were an ISP). We initially suggested that this might be a problem with cosmic rays. We even checked the frequency of the random crashes against sunspot cycles. No relationship found. It turned out that another component was failing due to a design error.
fictorial, almost 15 years ago
Great work digging into this issue. A memory test is probably in order.

I learned about ECC RAM when I was trying to figure out why some server lease deals were so inexpensive relative to others. For instance, the last I checked, hetzner.de's hardware does not support ECC RAM. I am of course not calling out hetzner, and there are other factors in such deals.
Luyt, almost 15 years ago
Idea: Use pieces of lead sheeting to shield the RAM chips from cosmic radiation.
gcb, almost 15 years ago
A common saying in the medical profession:

Sometimes a zebra is just a horse.
kunley, almost 15 years ago
Segfaults from Outer Space!! Duck and cover