Quantile Digest (q-digest) or something similar is what I believe is desired here.

From what I understand, it's a fixed-size data structure that represents quantiles as a tree of histogram bands, pruning the nodes whose densities differ least from their parents' to trade error for size. Q-digests also have the property that you can merge them together and re-compress, to turn per-second data into per-minute data, or to shrink accurate (large) archival digests into smaller ones and, say, support stable streaming of a varying number of metrics over a link of varying bandwidth by sacrificing quality.

They're pretty simple because they're designed for sensor networks, but I think you could design similar structures with a dynamic instead of fixed value range, and a variable size (prune nodes based on an error threshold instead of, or in addition to, a desired size). A rough sketch of the basic structure is below.

If anyone knows of a time-series system using something like this, I'd love to learn about it.
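To make that concrete, here's a rough Python sketch of the classic q-digest (values restricted to a fixed power-of-two range, as in the sensor-network setting; the single-pass compression and all names are my own simplification, not a reference implementation):

    from collections import defaultdict

    class QDigest:
        """Rough q-digest sketch over integers in [0, universe), universe a
        power of two. Nodes live in an implicit binary tree: root is 1,
        children of i are 2i and 2i+1, leaves are ids [universe, 2*universe)."""

        def __init__(self, universe, k):
            self.universe = universe        # size of the value range
            self.k = k                      # compression factor (~k nodes survive)
            self.n = 0                      # total observations seen
            self.counts = defaultdict(int)  # node id -> count

        def add(self, value, count=1):
            self.counts[self.universe + value] += count
            self.n += count

        def compress(self):
            """Prune: merge families whose combined count is small into the parent."""
            threshold = self.n // self.k
            for node in sorted(self.counts, reverse=True):  # deepest levels first
                if node <= 1 or node not in self.counts:
                    continue
                parent, sibling = node // 2, node ^ 1
                family = (self.counts.get(node, 0) + self.counts.get(sibling, 0)
                          + self.counts.get(parent, 0))
                if family <= threshold:
                    self.counts[parent] = family
                    self.counts.pop(node, None)
                    self.counts.pop(sibling, None)
            # (a full implementation repeats until no merge fires)

        def merge(self, other):
            """Merging two digests is node-wise addition plus a re-compress --
            this is what lets you roll per-second data up into per-minute data."""
            for node, c in other.counts.items():
                self.counts[node] += c
            self.n += other.n
            self.compress()

        def quantile(self, q):
            """Walk nodes ordered by the upper end of their value range,
            accumulating counts until we pass q*n."""
            def hi(node):
                while node < self.universe:  # descend to the rightmost leaf
                    node = 2 * node + 1
                return node - self.universe
            acc = 0
            for node in sorted(self.counts, key=hi):
                acc += self.counts[node]
                if acc >= q * self.n:
                    return hi(node)
            return self.universe - 1

You'd call compress() every so often rather than per insert, and quantile(0.99) then reads an approximate p99 off the surviving nodes; merge() is the roll-up operation I mentioned.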
You know what I've always found really useful? Entire distributions. Forget means, medians, and percentiles: give me the complete distribution (along with the sample size) so I can understand all of the nuances of the data.

(Better yet, just give me the raw data so I can analyze it myself. I find it hard to blindly trust someone else's conclusions, considering all of the p-hacking going on nowadays.)
I've recently written a page about "averaging" percentiles correctly by approximating the combined histogram of two distributions. It's demonstrated in a live plot with a logarithmic time axis here:

http://www.siegfried-kettlitz.de/blog/posts/2015/11/28/linlog_plot_quantiles/

If you have questions or comments, feel free to reply or email.
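The general idea, as a minimal sketch (the shared bucket edges and the two input distributions are made up, and this is the textbook bucket-merge rather than anything specific from the page):

    import numpy as np

    rng = np.random.default_rng(0)

    # two per-minute latency histograms sharing the same bucket edges (made-up data)
    edges = np.logspace(0, 4, 41)                 # 40 log-spaced buckets, 1 ms .. 10 s
    h1, _ = np.histogram(rng.lognormal(3.0, 0.8, 5_000), bins=edges)
    h2, _ = np.histogram(rng.lognormal(3.5, 1.0, 2_000), bins=edges)

    # histograms combine exactly: sum the bucket counts, then read off the quantile
    combined = h1 + h2
    cdf = np.cumsum(combined) / combined.sum()
    p99 = edges[1:][np.searchsorted(cdf, 0.99)]   # upper edge of the bucket holding the 99th pct
    print(p99)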
Great article; I ran into this exact problem at work today. You can also hit a very similar problem if the time-series aggregation system you're using does any pre-aggregation before you calculate the percentile. For example, if you sample your servers every 90 seconds, then any latency number each server reports is likely already averaged over the requests it received during that period, so your 99th-percentile number is really the latency of the 99th-percentile server, not the 99th-percentile request. Using latency buckets solves this problem as well, as in the sketch below.
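For illustration, a toy simulation of the two paths (the server count, request counts, and lognormal latencies are all invented):

    import numpy as np

    rng = np.random.default_rng(1)

    # hypothetical setup: 50 servers, each seeing ~2000 requests per 90 s window
    per_server = [rng.lognormal(3.0, 1.0, size=2_000) for _ in range(50)]
    all_requests = np.concatenate(per_server)

    # pre-aggregated path: each server reports only its mean latency
    server_means = np.array([s.mean() for s in per_server])
    p99_of_servers = np.percentile(server_means, 99)     # p99 *server*, not p99 request

    # bucketed path: each server reports counts per latency bucket; counts sum exactly
    edges = np.logspace(0, 4, 81)                        # 80 log-spaced buckets, 1 ms .. 10 s
    counts = sum(np.histogram(s, bins=edges)[0] for s in per_server)
    cdf = np.cumsum(counts) / counts.sum()
    p99_bucketed = edges[1:][np.searchsorted(cdf, 0.99)]

    print(p99_of_servers, np.percentile(all_requests, 99), p99_bucketed)

On runs like this the per-server figure badly understates the request-level tail, while the bucketed estimate lands within one bucket width of it.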
The article is useful because it outlines many different ways to monitor the performance of a system, many of which are better than just looking at the mean and the P99. However, the main thesis that "an average of a percentile is meaningless" is just plain wrong. If the distribution is fixed, then averaging different P99 measurements will give you the best possible estimate of the P99 of the population (as opposed to your sample). If the distribution is moving (because you're making performance improvements, or your user base is growing), then a moving average of a percentile will move with it.
Gave the post an upvote because it is interesting from a theoretical perspective, but I have a hard time imagining a real-life scenario where averaging a 99th percentile will lead you to the wrong conclusion.

Perhaps I'm wrong, but whenever I'm looking at the tail of a distribution, it's usually just to understand the order of magnitude, not to reach a precise number.
The problem with percentiles is that you are discarding data points: measuring the 98th percentile means ignoring the top 2% of the data.

The trouble is that the top 2% you're discarding might correspond to your top 2% of customers, and you're literally throwing their data away by using percentiles. Not good.

My recommendation is to pick two aggregation types: maybe percentile and maximum, maximum and mean, or percentile and mean. You can't really go wrong with that approach (see the sketch below).
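Something as simple as this per-window aggregate covers both the bulk and the extremes (the function and field names are just placeholders):

    import numpy as np

    def window_stats(latencies):
        """Aggregate one window with two complementary views: a percentile
        for the bulk, plus the maximum so the extreme tail is never discarded."""
        a = np.asarray(latencies)
        return {"n": a.size, "mean": a.mean(),
                "p98": np.percentile(a, 98), "max": a.max()}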
Does anyone know the math behind exactly how wrong the averaged percentiles are? My dim understanding of stats makes me think the central limit theorem is at play here; the averaged p99 values will tend towards a normal distribution, which is obviously wrong. Would love to be schooled on it.
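Not the math, but for intuition here's a simulation sketch one could run (the lognormal workload and the equal window sizes are pure assumptions):

    import numpy as np

    rng = np.random.default_rng(0)

    # assumed workload: one minute of lognormal request latencies, in 60 equal windows
    latencies = rng.lognormal(3.0, 1.0, 60_000)
    windows = latencies.reshape(60, 1_000)

    pooled = np.percentile(latencies, 99)                 # p99 over the raw data
    averaged = np.percentile(windows, 99, axis=1).mean()  # mean of per-window p99s

    print(f"pooled p99: {pooled:.1f}   averaged p99: {averaged:.1f}")

With equal, fairly large windows the two land close together; the gap grows as windows shrink, and if windows carry different request counts, a plain mean also weights them incorrectly.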
I think the lesson is: don't just blindly calculate some numbers/metrics. Have a look at your data (visually!) and see if your choices make sense, for instance whether the 99th / 95th / 90th percentile is the right one to use. Something as quick as the sketch below goes a long way.
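For example, a quick look could be as simple as (made-up data standing in for real measurements):

    import numpy as np
    import matplotlib.pyplot as plt

    latencies = np.random.lognormal(3.0, 1.0, 10_000)  # stand-in for real measurements

    plt.hist(latencies, bins=np.logspace(0, 4, 80))
    for q in (90, 95, 99):
        plt.axvline(np.percentile(latencies, q), linestyle="--", label=f"p{q}")
    plt.xscale("log")
    plt.xlabel("latency (ms)")
    plt.legend()
    plt.show()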
A day late, but I wanted to add: at New Relic (where I work) we ended up just deciding to store all the data points. We literally store every transaction and page view for our customers and then go back and read every data point when they are queried.