We launched QuestDB last summer [1, 2]. Our storage model is vector-based and append-only, which meant that all incoming data had to arrive in the correct time order. This worked well for some use cases, but we increasingly saw real-world deployments where data doesn't land at the database in chronological order. We saw plenty of developers and users come and go specifically because of this technical limitation, so dealing with out-of-order data became a priority.<p>The big decision was which direction to take to tackle the problem. LSM trees seemed an obvious choice, but we chose an alternative route so we wouldn't lose the performance we had spent years building. Our latest release supports out-of-order ingestion by re-ordering data on the fly. That's what this article is about.<p>Also, many people have asked about the differences between QuestDB and other open-source databases, and why users should consider it over other systems. When we launched on HN, readers showed a lot of interest in side-by-side comparisons with other databases on the market. One suggestion [3] that we thought would be great to try out was to benchmark ingestion and query speeds using the Time Series Benchmark Suite (TSBS) [4] developed by TimescaleDB. We're super excited to share the results in the article.<p>[1] <a href="https://news.ycombinator.com/item?id=23975807" rel="nofollow">https://news.ycombinator.com/item?id=23975807</a><p>[2] <a href="https://news.ycombinator.com/item?id=23616878" rel="nofollow">https://news.ycombinator.com/item?id=23616878</a><p>[3] <a href="https://news.ycombinator.com/item?id=23977183" rel="nofollow">https://news.ycombinator.com/item?id=23977183</a><p>[4] <a href="https://github.com/timescale/tsbs" rel="nofollow">https://github.com/timescale/tsbs</a>
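To make the "re-ordering on the fly" idea concrete, here is a minimal sketch of one way such an approach can work (this is an illustration, not QuestDB's actual implementation; the `TimeSeriesStore` class and its methods are hypothetical): late rows accumulate in a staging buffer, and on commit the buffer is sorted and merged with only the affected tail of the already-sorted, append-only store, so in-order data keeps the fast append path.

```python
# Hypothetical sketch of out-of-order (O3) ingestion, assuming a staging
# buffer merged into sorted append-only storage. Not QuestDB's real code.
import bisect


class TimeSeriesStore:
    def __init__(self):
        self.timestamps = []  # kept sorted; appended to in the common case
        self.values = []
        self.staging = []     # (ts, value) rows awaiting commit

    def append(self, ts, value):
        self.staging.append((ts, value))

    def commit(self):
        if not self.staging:
            return
        self.staging.sort(key=lambda row: row[0])
        oldest = self.staging[0][0]
        if not self.timestamps or oldest >= self.timestamps[-1]:
            # Fast path: all staged rows are newer than the stored tail,
            # so this is a pure append with no re-ordering cost.
            for ts, v in self.staging:
                self.timestamps.append(ts)
                self.values.append(v)
        else:
            # Slow path: split off only the tail that overlaps the staged
            # rows and merge-sort it, leaving the older prefix untouched.
            cut = bisect.bisect_left(self.timestamps, oldest)
            tail = list(zip(self.timestamps[cut:], self.values[cut:]))
            merged = sorted(tail + self.staging, key=lambda row: row[0])
            self.timestamps[cut:] = [ts for ts, _ in merged]
            self.values[cut:] = [v for _, v in merged]
        self.staging.clear()


store = TimeSeriesStore()
store.append(1, "a"); store.append(3, "c"); store.commit()
store.append(2, "b")  # arrives late, out of order
store.commit()
print(store.timestamps)  # [1, 2, 3]
```

The point of the tail-merge is that only data newer than the oldest late-arriving row is ever rewritten, which is what lets mostly-ordered workloads keep append-only performance.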
Is anyone using QuestDB in a memory-constrained scenario? I've been running InfluxDB 2.0 on a 1GB RAM VPS (+2GB swap, currently collecting data every 5s, though I'd prefer more often) for monitoring my home network, and it gets OOM killed if I try to view more than one day's worth of relatively simple data (ping, jitter, upload/download usage, cake sqm latencies, cpu usage), so I've been looking to replace it with something with a smaller memory footprint.<p>Most of the DB benchmarks I've seen are missing memory usage, which imo matters more for hobbyist/small-scale users who are fine with paying $0-$10/month, but would rather not pay ~$30-40/month for the 8GB RAM minimum a lot of time series DBs seem to want.
Congrats on the release! The benchmark results look really impressive :).<p>Curious to learn more about your approach to verifying the correctness of the implementation. Did you try testing it with Jepsen or something similar?
Excited to see this new release. It seems to me this would (slightly?) negatively impact query performance for recent data (when a query concerns data that sits in both the O3 and persisted zones), is that the case?