Author here. There's a lot I left out of the post, especially since it's such a young project, but I'd be happy to answer any questions.<p>Edit: I'm not sure why I'm being downvoted... I'm not home at the moment, so I'm trying my best to answer using my phone.<p>Edit #2: Back home with a full-size QWERTY keyboard :).
This is a pleasant, candid discussion of purpose-driven software as it is being implemented. The language is irrelevant; this type of piece is fun to read.
Hi, Prometheus[0] author here. Thanks for the interesting article!<p>Since I was curious how this compares to Prometheus's internal storage for writes, I whipped up some (disclaimer: very naive and ad-hoc!) benchmarks[1] to get a rough feel for Catena's performance. I'm not getting much write throughput out of it yet, but maybe I'm doing something wrong or using it inefficiently. Some questions to investigate: what's the best number of rows to batch in one insert, and are timestamps in seconds, milliseconds, or essentially only user-interpreted (I noticed the partitioning at least depends heavily on the interval between timestamps)? So far I've just done a tiny bit of fiddling and the results haven't changed dramatically.<p>The benchmark parameters:<p>* writing 10000 samples x 10000 metrics (100 million data points)<p>* initial state: empty storage<p>* source names: constant "testsource" for all time series<p>* metric names: "testmetric_<i>" (0 <= i < 10000)<p>* values: the metric index <i> (constant integer value within each series)<p>* timestamps: starting at 0 and increasing by 15 seconds for every iteration<p>* GOMAXPROCS=4 (4-core "Core i5-4690K" machine, 3.5GHz)<p>* Disk: SSD<p>* Other machine load: SoundCloud playing music in the background<p>The benchmark results:<p>#### Prometheus ####<p>(GOMAXPROCS=4 go run prometheus_bench.go -num-metrics=10000 -samples-per-metric=10000)<p>Time: 1m26s
Space: 138MB<p>#### Catena ####<p>(GOMAXPROCS=4 go run catena_bench.go -num-metrics=10000 -samples-per-metric=10000)<p>Time: 1h25m
Space: 190MB<p>So in this particular benchmark Catena took about 60x longer and used about 1.4x as much space.<p>Please don't take this as discouragement or a statement on one being better than the other. Obviously Catena is very new and also probably optimized for slightly different use cases. And possibly I'm just doing something wrong (please tell me!). I also haven't dug into possible performance bottlenecks yet, but I saw it utilize 100% of all 4 CPU cores the entire time. In any case, I'd be interested in a set of benchmarks optimized specifically for Catena's use case.<p>Unfortunately we also haven't fully documented the internals of Prometheus's storage yet, but a bit of background information can be found here: <a href="http://prometheus.io/docs/operating/storage/" rel="nofollow">http://prometheus.io/docs/operating/storage/</a> Maybe that's worth a blog post sometime.<p>[0] <a href="http://prometheus.io/" rel="nofollow">http://prometheus.io/</a><p>[1] The code for the benchmarks is here: <a href="https://gist.github.com/juliusv/ce7c3b5368cd7adf8bc6" rel="nofollow">https://gist.github.com/juliusv/ce7c3b5368cd7adf8bc6</a>
"Catena" is also a password hashing function: <a href="http://eprint.iacr.org/2013/525.pdf" rel="nofollow">http://eprint.iacr.org/2013/525.pdf</a>