I still can't figure out why people can't even come close to KDB+. It is a real conundrum. I've been waiting patiently for something to show up, but the gap seems to keep getting bigger instead of smaller.

Is it that people want to make the problem more complex than it needs to be? Is it that those who know the most about these issues don't share their secrets, so implementations from the outside often lack a good understanding of how to do things properly? If you asked the guy behind Prometheus whether he's looked at the commercial offerings and what he's learned from them, would he even be able to speak about them intelligently?

There seems to be a huge skills gap on these things that I can't put my finger on. I'd love to be able to use a real TSDB, even at only half the speed and usefulness. It would be great for the smaller firms that can't or won't pay the license fees for a commercial offering until they get larger.
You may also want to check https://github.com/criteo/biggraphite/wiki/BigGraphite-Announcement which is also about how to write a TSDB from scratch, but with different goals.
Exciting times in database land! It certainly seems like the good systems are converging on very similar storage architectures. This design is strikingly similar to how Kafka and Kudu work internally.

As the raw storage now seems close to optimal, I suspect the next jump in performance will come from a comeback of indices for more precise queries.
The description of this new storage engine does not explain how it manages data durability.

When you compare this with the extreme efforts traditional databases make to ensure that unplugging a server will never, ever result in data loss[0], staying silent on the problem makes me wonder.

Is it that at this ingest rate, even trying to ensure durability is a futile effort?

[0] https://www.sqlite.org/atomiccommit.html
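For context on what that effort looks like in practice: the standard technique is a write-ahead log that checksums each record and fsyncs it to stable storage before acknowledging the write. Here is a minimal Go sketch of that pattern; it illustrates the general technique only, and `appendRecord` is my own hypothetical helper, not Prometheus's actual storage code:

```go
package main

import (
	"encoding/binary"
	"hash/crc32"
	"os"
)

// appendRecord writes a length-prefixed, checksummed record and forces it
// to stable storage before the write can be acknowledged. The fsync is the
// expensive part: without it, pulling the plug can lose acknowledged data.
func appendRecord(f *os.File, payload []byte) error {
	var hdr [8]byte
	binary.BigEndian.PutUint32(hdr[0:4], uint32(len(payload)))
	binary.BigEndian.PutUint32(hdr[4:8], crc32.ChecksumIEEE(payload))
	if _, err := f.Write(hdr[:]); err != nil {
		return err
	}
	if _, err := f.Write(payload); err != nil {
		return err
	}
	// Durability point: the record survives power loss only after Sync returns.
	return f.Sync()
}

func main() {
	f, err := os.OpenFile("wal.log", os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		panic(err)
	}
	defer f.Close()
	if err := appendRecord(f, []byte("cpu_usage 0.42 1700000000")); err != nil {
		panic(err)
	}
}
```

The `f.Sync()` call is exactly where the ingest-rate tension shows up: at a million samples per second you can't fsync per sample, so engines either batch many records per sync or relax the guarantee, and it would be nice to know which trade-off this design makes.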
I had a question about the following statement from the post:

> "Prometheus's storage layer has historically shown outstanding performance, where a single server is able to ingest up to one million samples per second as several million time series"

How does one million samples per second equate to several million time series? Isn't a single sample equivalent to a single data point of a particular metric in a time series DB like Prometheus?
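Or is the scrape interval the missing piece? As a back-of-the-envelope guess on my part (not from the post): 3,000,000 active series each scraped once every 3 seconds would produce 3,000,000 / 3 = 1,000,000 samples per second, so the number of active series can exceed the per-second sample rate whenever the scrape interval is longer than one second.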