Your nines are not my nines (2019)

106 points by thewarpaint, over 1 year ago

8 comments

sjsdaiuasgdia, over 1 year ago
This is a concept I've had to explain to entirely too many teams over the years: 0.001% of requests failing as a (mostly) random distribution across all requests is very different from a 0.001% subset of requests that will fail (nearly) every time until the underlying issue is mitigated. They look the same on a high-level dashboard, but they are completely different conditions in terms of how the customer will feel it, and understanding which kind of problem you have also guides the investigation and troubleshooting process.
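To make the distinction concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the function names, the customer and request counts, and a 1% failure rate (used instead of 0.001% so that a small simulation shows the effect). It models "concentrated" failures as a fixed subset of customers whose requests always fail.

```python
import random
from collections import Counter


def simulate(num_customers, requests_per_customer, failure_rate,
             concentrated, seed=0):
    """Return a Counter mapping customer id -> number of failed requests."""
    rng = random.Random(seed)
    failures = Counter()
    if concentrated:
        # A fixed subset of customers fails on every single request.
        num_broken = max(1, round(num_customers * failure_rate))
        for customer in rng.sample(range(num_customers), num_broken):
            failures[customer] = requests_per_customer
    else:
        # Every request fails independently with the same probability.
        for customer in range(num_customers):
            for _ in range(requests_per_customer):
                if rng.random() < failure_rate:
                    failures[customer] += 1
    return failures


def summarize(failures, num_customers, requests_per_customer):
    """Compute the dashboard-level rate and two customer-level views."""
    total_requests = num_customers * requests_per_customer
    worst = max(failures.values(), default=0)
    return {
        "overall_failure_rate": sum(failures.values()) / total_requests,
        "customers_affected": len(failures),
        "worst_customer_failure_rate": worst / requests_per_customer,
    }


spread = summarize(simulate(1000, 100, 0.01, concentrated=False), 1000, 100)
focused = summarize(simulate(1000, 100, 0.01, concentrated=True), 1000, 100)
print("random:      ", spread)
print("concentrated:", focused)
```

Both runs report an overall failure rate near 1%, but in the concentrated case a handful of customers see every request fail, while in the random case many customers each see an occasional failure.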
RajT88, over 1 year ago
The way it works with cloud providers is that you can file for a refund for an SLA breach. After all, those SLAs are at a service level for the customer. If you're yelling at support or engineering on the phone, you're likely getting the 9's treatment the author describes. This is the wrong forum to hold the provider accountable unless you're yelling about mitigation time (then, best of luck to you!).

Reading the fine print on the SLAs is extremely important, because they often do not say what you think they say.

https://aws.amazon.com/legal/service-level-agreements/
https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services
https://cloud.google.com/terms/sla/

I have seen refunds on the order of hundreds of thousands of dollars. It's cold comfort if the impact to you was on the order of millions of dollars, but still it is something. As you can see, it's not a free-money-a-thon; it's generally a % of your spend on the services which were not available.

There typically is a defined process for submitting a refund ticket, which will result in an availability review. This documented process is not always easy to find.

The only one I could easily find is for Microsoft:

https://learn.microsoft.com/en-us/partner-center/request-credit#service-outages-service-level-agreement-issues-credit

(It's just a support topic when you're submitting a support ticket.)
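As a rough illustration of "a % of your spend", the sketch below computes a service credit from measured downtime. The tier thresholds and credit percentages are hypothetical and do not come from any provider's actual SLA; the function names and the dollar amounts are also assumptions. Real agreements define availability and eligibility in their own terms, which is why the fine print matters.

```python
def availability_percent(total_minutes, downtime_minutes):
    """Availability over a billing period, as a percentage."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def credit_percent(availability):
    """Map availability to a credit percentage.

    HYPOTHETICAL tiers for illustration only; not any provider's schedule.
    """
    if availability >= 99.99:
        return 0
    if availability >= 99.0:
        return 10
    if availability >= 95.0:
        return 25
    return 100


def service_credit(monthly_spend, total_minutes, downtime_minutes):
    """Credit owed on the affected service's spend for the period."""
    availability = availability_percent(total_minutes, downtime_minutes)
    return monthly_spend * credit_percent(availability) / 100


# A 30-day month with one hour of downtime on a service costing 20,000/month.
minutes_in_month = 30 * 24 * 60
credit = service_credit(20_000, minutes_in_month, downtime_minutes=60)
print(f"availability: {availability_percent(minutes_in_month, 60):.3f}%")
print(f"credit: {credit:.2f}")
```

Under these made-up tiers, an hour of downtime yields a credit of 10% of that service's monthly bill, regardless of what the outage cost the customer's own business.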
Animats, over 1 year ago
"You are the bug on the windscreen of the locomotive. The train has no idea you were ever there." - Rachel by the Bay.

That's how monopolies work. They need not fear their customers.

In time, this becomes Orwell's "If you want a vision of the future, imagine a boot stamping on a human face – forever." Ask anyone who's had a dispute with the Apple app store.
hughesjj, over 1 year ago
Hot take:

I would love to have service providers publicly show the (downsampled!) alarms they actually use for operational excellence (from a read replica, etc.).

Doing so would enforce that you actually have those in place, since they're public and now a marketing point. That said, I get the concern of trolls and competitors trying to get a "low score".
hinkley, over 1 year ago
There's an old joke that goes something like, "Most of the people chasing five nines uptime achieved five eights."
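For a sense of scale, the arithmetic behind the joke can be sketched as follows, assuming a 365-day year and reading "five eights" as 88.888% availability (both are assumptions, as is the function name).

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent):
    """Minutes of downtime per year implied by an availability figure."""
    return MINUTES_PER_YEAR * (100.0 - availability_percent) / 100.0


five_nines = downtime_minutes_per_year(99.999)
five_eights = downtime_minutes_per_year(88.888)
print(f"99.999%: {five_nines:.2f} minutes of downtime per year")
print(f"88.888%: {five_eights / (24 * 60):.1f} days of downtime per year")
```

Five nines allows a little over five minutes of downtime a year; five eights works out to roughly forty days.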
thegrim33, over 1 year ago
Sure, there's the issue of what your contract says and what the guarantee is, but all these companies do already track their metrics in ways that at least attempt to detect and respond to the problems the author describes.

They track their metrics by p50 (the median, i.e. the typical performance/reliability) but also by p99, p99.9, etc., which capture the performance/reliability for the extreme edge cases, such as exactly what the author is describing. They already do evaluate their systems from the perspective of how they're performing for the worst-affected customers. Again, maybe the issue is the contract itself, sure, but they do already try their best to prevent a small handful of customers from getting overly affected by something.
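A minimal sketch of how tail percentiles surface what a central figure hides. The latency numbers are invented, `percentile` is a hypothetical helper using the nearest-rank method, and real monitoring systems typically compute percentiles from histograms or sketches rather than by sorting raw samples.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("percentile() requires at least one sample")
    ordered = sorted(values)
    rank = math.ceil(p / 100.0 * len(ordered))
    rank = min(max(rank, 1), len(ordered))  # clamp to a valid 1-based rank
    return ordered[rank - 1]


# Invented data: 980 fast requests and 20 very slow ones (milliseconds).
latencies_ms = [100] * 980 + [5000] * 20

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
p999 = percentile(latencies_ms, 99.9)
print(f"mean={mean_ms} p50={p50} p99={p99} p99.9={p999}")
```

Here p50 reports the fast path, while p99 and p99.9 expose the slow 2% of requests. Note that these are per-request percentiles; whether the slow requests are spread across everyone or all belong to a few customers still needs a per-customer breakdown.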
tekla, over 1 year ago
I don't really get why cloud matters here. The exact same dynamic exists for on-prem services.
ChrisArchitect, over 1 year ago
(2019)