Failsafe – failure handling with retries, circuit breakers and fallbacks

116 points by jodah almost 9 years ago

8 comments

dredmorbius almost 9 years ago
A note on the name: "fail-safe" in engineering doesn't mean that a system cannot fail, but rather, that when it does, it does so in the safest manner possible.

The term originated with (or is strongly associated with) the Westinghouse railroad brake system. These are the pressurised air brakes on trains, in which air pressure holds the brake shoes open against spring pressure. Should integrity of the brakeline be lost, the brakes will fail in the activated position, slowing and stopping the train (or keeping a stopped train stopped).

https://en.m.wikipedia.org/wiki/Railway_air_brake

Fail-safe designs and practices can lead to some counterintuitive concepts. Aircraft landing on carrier decks, in which they are arrested by cables, apply full engine power and afterburner on landing. The idea is that should the arresting cable or hook fail, the aircraft can safely take off again.

https://en.m.wikipedia.org/wiki/Fail-safe

Upshot: "fail safe" doesn't mean "test all your failure conditions exhaustively". It may well mean to abort on any failure mode (see djb's software for examples). The most important criterion is that whatever the failure mode be, it be as safe as possible, and almost always, based on a very simple and robust design, mechanism, logic, or system.

From the description of this project, it strikes me that it may well be failing (unsafely?) to implement these concepts. Charles Perrow, scholar of accidents and risks, notes that it's often safety and monitoring systems themselves which play a key role in accidents and failures.
nitrogen almost 9 years ago
Very cool. Consistent and clear retry, backoff, and failure behaviors are an important part of designing robust systems, so it's disappointing how uncommon they are. If I were starting a new Java project today I would almost certainly want to use this library instead of the various threads and timers I had to hack together years ago.
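
For readers unfamiliar with the retry-and-backoff behavior mentioned above, here is a minimal sketch in plain JDK Java. It is not Failsafe's API; the class name `RetryWithBackoff`, the method `retry`, and its parameters are hypothetical names chosen for illustration. It assumes a simple policy: retry on any exception, double the delay after each failure, and cap the delay at a maximum.

```java
import java.util.concurrent.Callable;

// Hypothetical helper for illustration; not part of Failsafe or any library.
class RetryWithBackoff {

    /**
     * Runs the task up to maxAttempts times. After each failed attempt
     * (except the last) it sleeps, then doubles the delay, capped at
     * maxDelayMillis. If every attempt fails, the last exception is rethrown.
     */
    static <T> T retry(Callable<T> task, int maxAttempts,
                       long initialDelayMillis, long maxDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (InterruptedException e) {
                // Do not retry an interrupted task; restore the flag and stop.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay = Math.min(delay * 2, maxDelayMillis);
                }
            }
        }
        throw last;
    }
}
```

A call such as `RetryWithBackoff.retry(() -> fetch(), 5, 100, 2000)` (where `fetch` is a placeholder for your own operation) would wait 100, 200, 400, and 800 ms between the five attempts. A production policy would usually also add jitter and restrict which exceptions are retried.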
SwellJoe almost 9 years ago
This title would be 100% better with "for Java" on the end.
ckugblenu almost 9 years ago
Quite interesting. It shows potential for numerous use cases. Does anyone know of similar projects in other languages, like Python and JavaScript?
cpitman almost 9 years ago
How is this distinct from Hystrix (https://github.com/Netflix/Hystrix)? Why should I use one over the other?
ap22213 almost 9 years ago
It seems like a well-thought-out, fluent interface to what lots of Java developers (especially Java 8 ones) inevitably have to write themselves.
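
As an illustration of the kind of code developers end up writing by hand, here is a rough sketch of a circuit breaker with a fallback in plain JDK Java. It is not the API of Failsafe or Hystrix; `SimpleCircuitBreaker` and all of its members are hypothetical. It assumes a simple policy: open after a number of consecutive failures, fail fast while open, and allow one trial call once the open period has elapsed. The clock is injected so the behavior can be checked without sleeping.

```java
import java.util.concurrent.Callable;
import java.util.function.LongSupplier;

// Hypothetical sketch for illustration; not the API of any real library.
class SimpleCircuitBreaker {
    private final int failureThreshold;   // consecutive failures before opening
    private final long openMillis;        // how long to fail fast once open
    private final LongSupplier clock;     // e.g. System::currentTimeMillis
    private int consecutiveFailures = 0;
    private boolean open = false;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /**
     * Runs the task unless the circuit is open, in which case the fallback is
     * returned immediately. Holding the lock during the call serializes
     * callers, which keeps the sketch simple but would not suit production.
     */
    synchronized <T> T call(Callable<T> task, T fallback) {
        long now = clock.getAsLong();
        if (open && now - openedAt < openMillis) {
            return fallback; // open: fail fast without touching the dependency
        }
        // Closed, or the open period has elapsed: let this call through.
        try {
            T result = task.call();
            consecutiveFailures = 0;
            open = false;
            return result;
        } catch (Exception e) {
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt();
            }
            consecutiveFailures++;
            if (open || consecutiveFailures >= failureThreshold) {
                open = true;     // trip, or re-open after a failed trial call
                openedAt = now;
            }
            return fallback;
        }
    }
}
```

Even this small version has to deal with state transitions, timing, and thread safety, which is the boilerplate a dedicated library is meant to absorb.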
mandeepj almost 9 years ago
Some of these patterns for the .NET/Azure/C# stack are documented here: https://msdn.microsoft.com/en-us/library/dn568099.aspx
fdsaaf almost 9 years ago
Beware of runaway retries: https://blogs.msdn.microsoft.com/oldnewthing/20051107-20/?p=33433

Personally, I'd rather systems fail quickly, with retries only at the highest (application) and lowest (TCP) levels.
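
To make the runaway-retry concern concrete, the arithmetic can be sketched as follows. This assumes a stack of layers in which each layer independently retries the layer below it, so the worst-case number of calls reaching the bottom is the product of the attempts per layer. The class `RetryAmplification` and its method are hypothetical, and the numbers are illustrative rather than taken from the linked post.

```java
// Hypothetical illustration of how nested retries multiply.
class RetryAmplification {

    /**
     * Worst-case number of calls that reach the lowest layer when every
     * layer makes the given number of attempts (1 means no retry).
     */
    static long worstCaseAttempts(int... attemptsPerLayer) {
        long total = 1;
        for (int attempts : attemptsPerLayer) {
            if (attempts < 1) {
                throw new IllegalArgumentException("each layer needs >= 1 attempt");
            }
            // multiplyExact throws ArithmeticException instead of overflowing
            total = Math.multiplyExact(total, (long) attempts);
        }
        return total;
    }
}
```

With four layers making three attempts each, `worstCaseAttempts(3, 3, 3, 3)` is 81 calls for one logical request. Retrying only at the top and bottom layers, `worstCaseAttempts(3, 1, 1, 3)`, brings that down to 9.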