TechEcho

1 comment

I'm curios if this is a fair benchmark (considering it was conducted by one of the vendors).So I downloaded the mentioned whitepaper to look at the indexes they created in PostgreSQL.The only thing mentioned in the whitepaper is this table definition:<pre><code> CREATE TABLE IF NOT EXISTS t1 ( "uuid" VARCHAR, "ts" TIMESTAMP, "tenant_id" INT4, "sensor_id" VARCHAR, "sensor_type" VARCHAR, "v1" INT4, "v2" INT4, "v3" FLOAT4, "v4" FLOAT4, "v5" BOOL, "week_generated" TIMESTAMP, "taxonomy" ltree ); </code></pre> The table definition in Crate is a little different:<pre><code> CREATE TABLE IF NOT EXISTS b.t2 ( "uuid" STRING, "ts" TIMESTAMP, "tenant_id" INTEGER, "sensor_id" STRING, "sensor_type" STRING, "v1" INTEGER, "v2" INTEGER, "v3" FLOAT, "v4" FLOAT, "v5" BOOLEAN, "week_generated" TIMESTAMP GENERATED ALWAYS AS date_trunc('week', ts), INDEX "taxonomy" USING FULLTEXT (sensor_type) WITH (analyzer='tree') ) PARTITIONED BY ("week_generated") CLUSTERED BY ("tenant_id") INTO 3 SHARDS; </code></pre> I don't know anything about how Crate works, but I see PARTITIONED BY ("week_generated") and CLUSTERED BY ("tenant_id") INTO 3 SHARDS.Now looking at the first query, both, their PARTITION BY as well as CLUSTER BY columns appear in the query:<pre><code> SELECT min(v1) as v1_min, max(v1) as v1_max, avg(v1) as v1_avg, sum(v1) as v1_sum FROM b.t2 WHERE tenant_id = ? AND week_generated BETWEEN ? AND ?; </code></pre> Unless there is a sufficiently good index used in PostgreSQL, this doesn't seem to be a fair benchmark.In the Appendix they mention how the partitioning could be implemented in PostgreSQL, also mention the indexes to be used together with the partitioning, but also say: "Because of this, we abandoned the partitioning approach for our benchmark, instead opting for a single PostgreSQL table with no partition logic."I didn't find any statement about if/which indexes they were using in PostgreSQL when measuring their benchmarks.They also use a rather old PostgreSQL version (9.2 — released in 2012). Now, we have PostgreSQL 9.6 which has some parallel query execution support that might also change this figures dramatically.

评论 #14275066 未加载

1 comment

MarkusWinandabout 8 years ago

评论 #14275066 未加载

“SQL queries ran up to 29 times faster on CrateDB than they did on PostgreSQL”

1 comment

“SQL queries ran up to 29 times faster on CrateDB than they did on PostgreSQL”

1 comment