Author here.

A few weeks ago, we wrote about how we implemented SIMD instructions to aggregate a billion rows in milliseconds [1], thanks in large part to Agner Fog's VCL library [2]. Although the initial scope was limited to table-wide aggregates into a single scalar value, it was a first step towards very promising results on more complex aggregations. With the latest release of QuestDB, we are extending this level of performance to key-based aggregations.

To do this, we implemented Google's fast hash table, aka "Swisstable" [3], which can be found in the Abseil library [4]. In all modesty, we also found room to accelerate it slightly for our use case. Our version of Swisstable is dubbed "rosti", after the traditional Swiss dish [5]. There were also a number of improvements thanks to techniques suggested by the community, such as prefetch (which, interestingly, turned out to have no effect in the map code itself) [6]. Besides C++, we used our very own queue system, written in Java, to parallelise the execution [7].

The results are remarkable: millisecond latency on keyed aggregations that span billions of rows.

We thought this would be a good occasion to show our progress by making the latest release available to try online with a pre-loaded dataset. It runs on an AWS instance using 23 threads. The data is stored on disk and includes a 1.6-billion-row NYC taxi dataset, 10 years of weather data at roughly 30-minute resolution, and weekly gas prices over the last decade. The instance is located in London, so folks outside of Europe may experience different network latencies. The server-side time is reported as "Execute".

We provide sample queries to get started, but you are encouraged to modify them. However, please be aware that not every type of query is fast yet. Some still run under an old single-threaded model. If you find one of these, you'll know: it will take minutes instead of milliseconds. But bear with us; it is just a matter of time before we make these instantaneous as well. Next in our crosshairs are time-bucket aggregations using the SAMPLE BY clause.

If you are interested in checking out how we did this, our code is available open source [8]. We look forward to receiving your feedback on our work so far. Even better, we would love to hear more ideas to further improve performance. Even after decades in high-performance computing, we are still learning something new every day.
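For the curious, the heart of the Swisstable idea fits in a few lines. What follows is a simplified sketch of the SIMD probe step, not our actual rosti code (that lives at [5]): each slot carries a one-byte control tag holding a 7-bit fragment of the key's hash, and a single SSE2 compare filters 16 slots at once.

    // Simplified sketch of the Swisstable group probe, not rosti itself.
    // ctrl points at a group of 16 one-byte control tags; h2 is the
    // 7-bit hash fragment we are looking for.
    #include <emmintrin.h>  // SSE2 intrinsics
    #include <cstdint>

    // Returns a 16-bit mask with bit i set when ctrl[i] == h2.
    inline uint32_t group_match(const int8_t* ctrl, int8_t h2) {
        __m128i group = _mm_loadu_si128(
            reinterpret_cast<const __m128i*>(ctrl));
        __m128i eq = _mm_cmpeq_epi8(group, _mm_set1_epi8(h2));
        return static_cast<uint32_t>(_mm_movemask_epi8(eq));
    }

Only the slots whose bit is set in the mask get compared against the full key, so most probes touch just one or two cache lines.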
[1] https://questdb.io/blog/2020/04/02/using-simd-to-aggregate-billions-of-rows-per-second

[2] https://www.agner.org/optimize/vectorclass.pdf

[3] https://www.youtube.com/watch?v=ncHmEUmJZf4

[4] https://github.com/abseil/abseil-cpp

[5] https://github.com/questdb/questdb/blob/master/core/src/main/c/share/rosti.h

[6] https://github.com/questdb/questdb/blob/master/core/src/main/c/share/vec_agg.cpp#L155

[7] https://questdb.io/blog/2020/03/15/interthread

[8] https://github.com/questdb/questdb
How timely! I've done a deep dive into column-store databases for the past couple of weeks. Reading through the QuestDB docs, I'd give it the following characteristics. Are these accurate?

- Single-node database, not [yet] distributed.

- Primary focus is time-series data, specifically in-order time-series data (the `designated timestamp` extension).

- Physical data layout is an append-only column store.

- Implements a small subset of SQL with some affordances for time series (LATEST BY, SAMPLE BY).

- Doesn't support explicit GROUP BY or HAVING clauses. Instead, QuestDB implicitly assumes GROUP BY or HAVING based on the presence of aggregation functions in the SELECT clause.

- Small standard library of functions: only 4 text functions.

Based on these characteristics, it seems QuestDB is well positioned against Influx. It's probably faster than TimescaleDB, but significantly less flexible given that Timescale has all of Postgres behind it. QuestDB might eventually compete with ClickHouse, but it's a long way out given that it's not distributed and implements a much smaller subset of SQL.

I'd love to get any insight into join performance. Quite a few column stores handle large joins poorly (ClickHouse, Druid).
Ask HN: what's the market for a new database? Who do you build it for, and how do you sell it? Asking as an uninformed user who hasn't needed a niche database.
Very cool. Major props for putting this out there and accepting arbitrary queries.

A couple of comments/questions:

- Correct that there's no GROUP BY support?

- EXPLAIN or similar would be nice, both to get a peek at how your engine works and to anticipate whether a query will use parallel/SIMD execution and take milliseconds vs. minutes.
This seems very similar to Victoria Metrics, which is very much based on the design of ClickHouse and currently shows best-in-class performance numbers for time-series data. It would be a lot more interesting to see a comparison to Victoria Metrics than to ClickHouse (which is not fully optimized for time series).
Victoria Metrics is Prometheus-compatible, whereas QuestDB now offers Postgres compatibility. Both are compatible with InfluxDB.
I'm not much of a database expert by any stretch, but this query took about 80 seconds, which seems like quite a long time. Maybe it's more complicated than it appears? My understanding is that the GROUP BY is handled automatically, and the results seem to support that:

    select cab_type, payment_type, count() from trips;
I was looking for dev/production specs for QuestDB the other day and didn't see them in the docs. Given that it's written in Java, which can be quite memory-hungry, what are the minimum and recommended RAM/CPU requirements to run it?
It looks very promising, congrats. I use ClickHouse in production and I'd love to see how this project evolves.
My main disappointment is the small number of aggregation functions: https://questdb.io/docs/functionsAggregation

ClickHouse provides hundreds of functions, many of which I use. It would be hard to even consider QuestDB with so few.
I'll stay tuned, anyway.
Keep up the good work!
Any plans for geo support? Looks like you have lat/lon stored as two independent doubles, which does not lend itself well to any sort of geo operations.
    SELECT vendor_id, cab_type, avg(fare_amount) from trips;

This takes ~86 seconds. Ran it multiple times.

SIMD is one of the ingredients for better query performance, but NOT THE ONLY ONE. See this for more info: https://www.youtube.com/watch?v=xJd8M-fbMI0
Just a heads up - I tried this random query:

    select * from trips where pickup_latitude < -74 order by pickup_longitude desc limit 10;

Didn't get a result set. The little box in the upper right-hand corner stated "Maximum number of pages (16) breached in MemoryPages".
Very impressive. I think building your own (performant) database from scratch is one of the most impressive software engineering feats. Can you share a little about how an interested person with a CS background can approach such a daunting topic?
Could you take a moment to comment on why one would choose QuestDB over Google's Bigtable or BigQuery? Furthermore, how does this compare to other proprietary and open-source solutions currently on the market? In short, I'm struggling to understand where this technology fits and the overall value proposition.

Asked a different way: if I were an IT lead, director-level executive, CTO, etc., how and why should I begin to evaluate the QuestDB service/solution?

Thank you for sharing, bluestreak and team, very interesting and exciting work!
Sharing some thoughts here, since I have recently been developing something similar:

1. "Query 1.6B rows in milliseconds, live" is just like saying "sum 1.6B numbers from memory in milliseconds".

If full SQL functionality isn't supported, a naive SQL query is just a tight loop over arrays (as partitions for naive data parallelism) on a multi-core processor. So this kind of query amounts to a several-line benchmark (ignoring data preparation and threading setup) of how quickly the sum loop finishes. In other words, it is a naive memory-bandwidth benchmark; a sketch of such a loop is below.

Let's count: a 6-channel Xeon-SP provides ~120 GB/s of memory bandwidth, so a sum loop over 1.6B uncompressed 4-byte ints must read 6.4 GB and could finish in about 1.6×4/120 ≈ 50 ms. If you then measure 200 ms in some database, 75% of the time (150 ms) was spent on things other than what your own home-brewed small C program would need for such a toy analysis.

2. Some readers like to see comparisons to ClickHouse (CH below).

The fact is that CH is a little slow on such naive cases (see the benchmarks at [1] that others have pointed to). This is because CH is a real-world product: the optimizations shown here reflect ten years of research and practice in the database industry, all of which is included in CH, and much more besides.

Can the claim in the title hold when reading from persistent disk, or when the query does a high-cardinality aggregation? (Note that a low-cardinality aggregation is just a tight loop plus a hash table that fits in L2.)

[1] https://tech.marksblogg.com/benchmarks.html
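To make that concrete, here is a rough sketch of the several-line benchmark described above, written as a threaded C++ sum. This is my own illustrative code, not anyone's product: thread count and chunking are untuned, and it needs roughly 6.4 GB of RAM.

    // Naive memory-bandwidth "benchmark": threaded sum of 1.6B
    // 4-byte ints (~6.4 GB). Illustrative sketch, not tuned code.
    #include <algorithm>
    #include <chrono>
    #include <cstdint>
    #include <cstdio>
    #include <numeric>
    #include <thread>
    #include <vector>

    int main() {
        const size_t n = 1600000000ULL;  // 1.6B ints, ~6.4 GB
        std::vector<int32_t> data(n, 1);
        const unsigned workers =
            std::max(1u, std::thread::hardware_concurrency());
        std::vector<int64_t> partial(workers, 0);
        const size_t chunk = (n + workers - 1) / workers;
        const auto t0 = std::chrono::steady_clock::now();
        std::vector<std::thread> pool;
        for (unsigned w = 0; w < workers; ++w) {
            pool.emplace_back([&, w] {
                const size_t lo = std::min(n, w * chunk);
                const size_t hi = std::min(n, lo + chunk);
                // Hot loop: one sequential pass over this thread's slice.
                partial[w] = std::accumulate(
                    data.begin() + lo, data.begin() + hi, int64_t{0});
            });
        }
        for (auto& t : pool) t.join();
        const int64_t sum =
            std::accumulate(partial.begin(), partial.end(), int64_t{0});
        const auto ms =
            std::chrono::duration_cast<std::chrono::milliseconds>(
                std::chrono::steady_clock::now() - t0).count();
        std::printf("sum=%lld in %lld ms\n",
                    (long long)sum, (long long)ms);
    }

Whatever this prints on a given box is roughly the bandwidth floor; anything a database reports beyond it is the overhead I'm talking about.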
How nice of you to hijack the back button (to return to HN after some sample queries), despite that being how I and others found this in the first place.

I was really liking the product up until that point.
I don't know anything about this database, but this link did break the browser <back> button. Perhaps you can look into fixing that in a future release.