TechEcho

No Single Points of Failure

24 points by honoredb almost 11 years ago

3 comments

lsh123 almost 11 years ago
A good summary with one exception: monitoring, instrumentation, and logging didn't get enough attention. Failure is the norm, so first and foremost you want to know when a failure occurs, and then you need to be able to investigate what went wrong. You should literally monitor and instrument everything: every API call, every DB access, and every page render should include code to record latency, result codes, payload size, etc. Every error (even a benign one) should be logged, preferably with a stack trace. Any unexpected condition should be checked and logged. All the instrumentation data should be graphed and stored for a long period of time so you can analyze the impact of your code changes on system performance and correlate it with system failures.
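A minimal sketch of the kind of per-call instrumentation described above, assuming Python and an in-memory list standing in for a real metrics store. The names `instrumented`, `METRICS`, and `fetch_profile` are hypothetical and only illustrate the idea: record latency, outcome, and payload size for every call, and log every error with its stack trace.

```python
import functools
import logging
import time

logger = logging.getLogger("instrumentation")

# Hypothetical in-memory sink; a real system would ship these records
# to a metrics store so they can be graphed and kept long term.
METRICS = []


def instrumented(name):
    """Record latency, outcome, and payload size for each call of the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            record = {"name": name, "status": "ok", "payload_size": None}
            try:
                result = func(*args, **kwargs)
                # Payload size is only recorded when the result has a length.
                if hasattr(result, "__len__"):
                    record["payload_size"] = len(result)
                return result
            except Exception:
                record["status"] = "error"
                # logger.exception logs at ERROR level and appends the stack trace.
                logger.exception("call to %s failed", name)
                raise
            finally:
                # Runs on both success and failure, so every call is measured.
                record["latency_s"] = time.perf_counter() - start
                METRICS.append(record)
        return wrapper
    return decorator


@instrumented("fetch_profile")
def fetch_profile(user_id):
    """Hypothetical API call used only to exercise the decorator."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return b'{"id": %d}' % user_id
```

Calling `fetch_profile(7)` appends one record with status `ok`, and calling `fetch_profile(-1)` appends one with status `error` and still re-raises, so instrumentation never hides the failure from the caller.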
colechristensen almost 11 years ago
I don't like the 'no single point of failure' maxim because I think it leads people to make strange or incorrect decisions and neglect things in order to serve the maxim instead of doing what's best.

Being 'fail safe' is much more important than being redundant. That is, you need to design your product's failure. How well it works and how rarely it fails are important, but not nearly as important as how well it fails.

This means monitoring to know when it fails, auditing to know how it failed after the fact, backups for recovering after the fact, and most importantly (and hardest to define), predicting what can fail and how, and designing your product's behavior after that failure.
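One way to read "design your product's failure" is to decide in advance what the product returns when a dependency breaks. The sketch below assumes Python; `with_fallback`, `live_recommendations`, and `cached_recommendations` are hypothetical names, and the failing service is simulated rather than real.

```python
import logging

logger = logging.getLogger("failsafe")


def with_fallback(primary, fallback):
    """Return (result, degraded). If primary() raises, log it and use fallback()."""
    try:
        return primary(), False
    except Exception:
        # Record the failure with its stack trace for later auditing.
        logger.exception("primary path failed; serving degraded result")
        return fallback(), True


def live_recommendations():
    """Hypothetical dependency, hard-coded to fail for illustration."""
    raise ConnectionError("recommendation service unreachable")


def cached_recommendations():
    """Hypothetical pre-planned degraded answer."""
    return ["most-popular-1", "most-popular-2"]


items, degraded = with_fallback(live_recommendations, cached_recommendations)
```

Returning the `degraded` flag alongside the result lets the caller tell the user, or a monitoring system, that it is running in its planned failure mode rather than silently pretending everything is fine.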
bittermang almost 11 years ago
There's always a single point of failure. The user.