Ask HN: How to find 'error budget' as a DevOps Engineer?

3 points by FahadUddin92, over 6 years ago
I am trying to find an error budget for a site that has several outside APIs integrated in it for its core features. How can I find the error budget for it?

1 comment

Juliate, over 6 years ago
Error budget = the amount of downtime your site can still afford within a given time frame.

If you have the SLAs of your outside APIs, you can compute your own maximum possible SLO and derive your full error budget from it. That budget then shrinks over time, as you use it.

Say your site depends on 3 external APIs, each with a 99% SLA, and needs all of them to work. Your best possible site SLO would be 99% x 99% x 99% ≈ 97% (your site is, at best, as reliable as the product of the reliabilities of its dependencies).

That is, unless your site has some built-in tactics for the specific downtime scenarios of these APIs (caching, retries, slowing down, graceful limitation of features, etc.).

Should you pick a lower SLA than your SLO for your site then? Always. Things happen.

Let's take a 95% SLA for simplicity.

Your max error budget for 30 days would be, as a formula:

      total time frame (say, 30 days = 720 hours)
    - target availability (at 95% avail. that would be 684 hours)
    - total downtime you've already had within this time frame
    = 36 hours or less

That's a start. Then you can track your own production indicators and adjust accordingly.

Reminder: https://enqueuezero.com/the-difference-between-sli-slo-and-sla.html
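The arithmetic in this comment can be sketched in a few lines of Python. This is only an illustration: the function names and the 10-hour incident are hypothetical, and the product rule assumes the site needs every dependency and that their failures are independent, with no caching or fallback in place.

```python
from math import prod


def composite_availability(dependency_slas):
    """Best-case availability when every dependency must be up.

    Assumes independent failures and no mitigation (caching, retries, etc.).
    """
    return prod(dependency_slas)


def remaining_error_budget_hours(target, window_hours, downtime_hours=0.0):
    """Hours of downtime still affordable in the window at the given target."""
    return window_hours * (1.0 - target) - downtime_hours


# Three external APIs, each with a 99% SLA (figures from the comment).
best_case_slo = composite_availability([0.99, 0.99, 0.99])

# 30-day window and a 95% target, as in the comment.
window_hours = 30 * 24
budget = remaining_error_budget_hours(0.95, window_hours)

# Hypothetical: 10 hours of downtime already spent in this window.
after_incident = remaining_error_budget_hours(0.95, window_hours, downtime_hours=10.0)

print(f"best-case SLO: {best_case_slo:.4f}")
print(f"full budget at 95%: {budget:.1f} h")
print(f"budget left after 10 h of downtime: {after_incident:.1f} h")
```

With these inputs the best-case SLO comes out at about 0.9703, the full budget at 36 hours, and 26 hours remain after the hypothetical incident. A negative result would mean the budget for the window is already overspent.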