A summary of how not to measure latency

36 points, by juanrossi, over 9 years ago

2 comments

cortesoft, over 9 years ago
Some of this is good, but the idea that you can determine your likelihood of experiencing a 99th-percentile latency on a webpage by the naive probability calculation shown (1 - 0.99^n, where n is the number of objects requested on a page) is silly. That assumes latency is completely randomly distributed across all objects and all clients of a page.

This is completely not true. Latency is very dependent on the client requesting and the object being requested. You are going to get clustering, not an even distribution.
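A minimal sketch of the contrast this comment draws, not from the article itself: the first loop is the naive independence calculation (1 - 0.99^n), and the second is a toy simulation where slowness clusters on a small fraction of "bad" page loads. The specific numbers (2% bad loads, 40% per-object slow probability) are made up purely for illustration, chosen so the overall per-object slow rate still works out to roughly 1%.

```python
import random

# Naive independence assumption: each of n objects independently has a 1%
# chance of a p99-or-worse response, so the page sees at least one slow
# response with probability 1 - 0.99^n.
for n in (1, 10, 50, 100):
    print(f"n={n:3d}  naive P(page sees a p99+ latency) = {1 - 0.99**n:.3f}")

# Toy simulation of the clustering described above: slowness concentrates on
# a small fraction of page loads (a congested client, a cold backend) instead
# of being sprinkled independently across every object.
random.seed(0)
n_objects, trials = 100, 100_000
slow_pages = 0
for _ in range(trials):
    bad_load = random.random() < 0.02        # 2% of page loads are "bad"
    p_slow = 0.40 if bad_load else 0.002     # per-object slow probability
    if any(random.random() < p_slow for _ in range(n_objects)):
        slow_pages += 1
print(f"clustered: P(page sees a p99+ latency) ~ {slow_pages / trials:.3f}")
```

Under these assumed numbers the naive formula says roughly 63% of 100-object page loads hit a slow response, while the clustered simulation lands near 20%: the same per-object p99 produces very different page-level experiences once latency is correlated.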
jonaf, over 9 years ago
Doesn't this assume a single-threaded application? The example of a clerk's service time and people waiting in line is oversimplified. Modern systems have maybe 100 clerks per store, and many stores; how do you perform a "Ctrl+Z" test in this case? Even if you had a perfectly divided line of people waiting at each cashier in each store (machine), the worst case would be experienced by the people in line for the store or clerk with a reduced service time. Thus, for accuracy, you would need to measure queue depth at the maximum latency per thread (clerk) and add that latency to each subsequent request until you serve the number of requests in your queue. This kind of math requires constant sampling that would slow down any system so dramatically it would defeat the purpose. I think this becomes even clearer when you consider that most such systems have load-balancing strategies that further mitigate queue depths, such that requests are intentionally distributed based on which backend services have the lowest historical latencies (and yes, I realize these algorithms are likely plagued by the same "omission conspiracy" mentioned -- but they certainly don't uniformly distribute requests).

In summary, let's focus on the max latency, home in on which backend exhibited said latency, identify the depth of the queue at the time that latency was experienced, and use that information to model the impact on users. From this, I expect you can draw some meaningful percentiles in terms of latency distributions, without having to measure more data points than is feasible without degrading latency further.

Am I misunderstanding something? I'm no math whiz; this is mostly intuition.
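A rough sketch of the kind of back-fill this comment gestures at (and which resembles the expected-interval correction used by tools like HdrHistogram): while one request is stalled, the requests queued behind it would each have seen the residual stall as extra latency, so those hidden samples are reconstructed from the max latency and the expected issue interval. The function name and the numbers in the usage example are invented for illustration, not taken from the article.

```python
def backfill_stall(latencies_ms, expected_interval_ms):
    """Reconstruct samples hidden behind a stall.

    For each recorded latency longer than the expected interval between
    requests, add synthetic samples for the requests that would have been
    issued (and delayed) while the stall was in progress.
    """
    corrected = []
    for lat in latencies_ms:
        corrected.append(lat)
        remaining = lat - expected_interval_ms
        while remaining > expected_interval_ms:
            corrected.append(remaining)   # a queued request saw this much wait
            remaining -= expected_interval_ms
    return corrected


# Usage: one clerk stalls for 1000 ms while requests arrive every 100 ms.
raw = [5, 5, 1000, 5, 5]
fixed = backfill_stall(raw, expected_interval_ms=100)
print(sorted(fixed, reverse=True)[:5])  # the 900, 800, ... ms waits now appear
```

Percentiles computed from the raw list hide the queue entirely; the corrected list exposes the waits that queued requests would actually have experienced, which is the gap the comment is trying to close.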