Crash-Only Software and Recursive Microreboots

25 points, posted by firloop about 3 years ago

5 comments

colanderman, about 3 years ago
Most (all?) of the paper links appear to be dead. Here's some slides on the topic: http://roc.cs.berkeley.edu/retreats/winter_03/slides/candea_crashonly.pdf

Interesting to see a name put to this. I used this technique (not having heard of it before) when developing a short but finicky and critical cloud-coordinated cross-datacenter disaster recovery consensus algorithm a few years ago. There was a point at which the algorithm recorded its state and a timestamp to local storage, so progress and correctness were guaranteed in the face of a crash. I came to the same conclusions as these researchers -- this state transition did not need to be fast, but *did* need to be absolutely robust and well tested. So rather than code and test both the "happy path" and the crash-recovery path -- I just called `_Exit(1)` and only coded and tested the crash-recovery path.

(It wasn't a perfect confluence of testing space -- my code exited in a way that did not trigger a core dump; whereas most failure modes did. But generally, core dumps bogging down an already-ill system were an issue we had to contend with.)

Would be interesting to see a DSL designed around this technique -- a semi-persisted program state of sorts -- with the happy-path automatically generated as an optimization.
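
For readers who want to see the shape of the idea, here is a minimal Python sketch of the pattern described above: persist state plus a timestamp durably, then exit abruptly so that startup recovery is the only path that ever runs. It is not the commenter's code. The file layout, the `step` counter, and the names `save_state`, `load_state`, `run_step`, and `main` are all hypothetical, and Python's `os._exit` stands in for C's `_Exit`.

```python
import json
import os
import tempfile
import time


def save_state(path, state):
    """Persist state and a timestamp; a crash leaves the old or new file, never a torn one."""
    record = {"state": state, "timestamp": time.time()}
    dir_name = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    with os.fdopen(fd, "w") as f:
        json.dump(record, f)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename
    os.replace(tmp_path, path)  # atomic replacement of the previous record


def load_state(path, default):
    """Recovery path: return the last persisted state, or default on first start."""
    try:
        with open(path) as f:
            return json.load(f)["state"]
    except FileNotFoundError:
        return default


def run_step(path, total_steps):
    """Recover, advance by one (hypothetical) unit of work, persist, return the state."""
    state = load_state(path, {"step": 0})
    if state["step"] < total_steps:
        state["step"] += 1
        save_state(path, state)
    return state


def main(path="state.json", total_steps=3):
    """One process lifetime: do one step, then 'crash' on purpose."""
    state = run_step(path, total_steps)
    if state["step"] < total_steps:
        # Assumption: an external supervisor restarts the process after this exit.
        os._exit(1)  # no cleanup handlers run; the next start goes through load_state
```

Under these assumptions, a supervisor (a shell loop, systemd, or similar) would call `main()` repeatedly. Every run starts from `load_state`, so the recovery code is exercised on each iteration instead of only after a rare failure.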
strofcon, about 3 years ago
So... Erlang / OTP? Sweet.
macintux, about 3 years ago
Here's a link to the first paper listed: https://www.usenix.org/legacy/event/osdi04/tech/full_papers/candea/candea.pdf
ccvannorman, about 3 years ago
It would be amazing if my OS had this capability. It is frustrating when things get buggy, slow, and crash-y, and it eats up a lot of my attention to wait for the application / system to reboot. If it could be constantly rebooting, even if it was slower overall, the mitigation of sudden bursts of slow would be VERY worth it.

Why are we not funding this!?
elliottkember, about 3 years ago
I once had a router that would get slower and slower. Rebooting it brought it back to life. I had a timer power strip, and set it to power off at 2am for 15 minutes every night. It worked beautifully.