People interested in this might also want to check out simmer, which provides a simple Unix stdin/stdout interface for this class of bounded summaries of unbounded streams. Quantiles are actually a weak point for simmer right now (I have a TODO item to add Q-Digest), but it has a bunch of other useful sketches implemented via the Algebird library, which I also hack on. I would love to get feedback and patches. <a href="https://github.com/avibryant/simmer" rel="nofollow">https://github.com/avibryant/simmer</a>
Hey, this is very close to something I've been needing recently (and in Go, no less).<p>Is there any way to get a similar thing for a sliding window of a stream? For example, to be able to report (estimated) 90th percentile latencies for server requests in the last 5 minutes, hour, and day.
This is a really awesome problem! I tackled this for my work at TempoDB and ended up going with the Q-Digest algorithm, although I took a good look at CKMS. It's really cool to see that this implements stream merging; I remember reading that merging streams is more difficult with CKMS than with Q-Digest.<p>If anyone is interested, this is my write-up on algorithm selection: <a href="http://blog.tempo-db.com/post/42318820124/estimating-percentiles-on-streams-of-data" rel="nofollow">http://blog.tempo-db.com/post/42318820124/estimating-percent...</a>
I don't really understand what this does, and I didn't find a "for dummies" section on the site. Can anyone give a real-world use case and (ideally) a comparison with some other system that does the same thing?