A composite index can also be used for a partial index scan[1], so if you frequently filter on just int100 as well as on the int100,int1000 combination (as in the article's example), a composite index on (int100,int1000) can also serve the queries that filter only on int100.<p>The order of the columns in a composite index therefore matters. We got some nice savings by reordering columns in indexes and changing the join order in queries (our DB wasn't <i>that</i> smart), or by adding "useless" filters to the where clause, which let us consolidate multiple indexes into one composite index.<p>[1]: I might be using the wrong terminology here; I don't mean a partial index[2], but using only the first N columns of an M-column index.<p>[2]: <a href="https://www.postgresql.org/docs/current/indexes-partial.html" rel="nofollow">https://www.postgresql.org/docs/current/indexes-partial.html</a>
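Not the article's setup, but here's a quick way to watch the leading-column (prefix) behavior with SQLite's query planner; the table and index names are made up for illustration:

```python
import sqlite3

# In-memory database with a composite index on (int100, int1000).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, int100 INTEGER, int1000 INTEGER)")
con.execute("CREATE INDEX idx_composite ON t (int100, int1000)")

# Filtering on both columns can seek the composite index...
plan_both = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE int100 = 1 AND int1000 = 1"
).fetchall()
print(plan_both)

# ...and so can filtering on the leading column alone.
plan_prefix = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE int100 = 1"
).fetchall()
print(plan_prefix)

# Filtering only on the trailing column cannot seek it; the planner
# falls back to some kind of scan instead of an index search.
plan_trailing = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE int1000 = 1"
).fetchall()
print(plan_trailing)
```

The prefix query's plan mentions idx_composite with an int100 equality, while the trailing-column query's plan is a scan, which is the asymmetry that makes column order worth thinking about.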
Comparison on ClickHouse: this also runs in about 30-40ms, but no index is used; it's a full-table scan.<p><pre><code> create table if not exists test_table
(
id UInt64,
text1 String,
text2 String,
int1000 UInt64,
int100 UInt64,
int10 UInt64,
int10_2 UInt64
)
engine = MergeTree()
order by (id)
;
insert into test_table
with
repeat('b', 1024) as one_kib,
repeat('b', 255) as bytes_255
select
number as id,
one_kib,
bytes_255,
rand() % 1000 as int1000,
rand() % 100 as int100,
rand() % 10 as int10,
rand() % 10 as int10_2
from numbers(10e6)
;
> select count(*) from test_table where int1000 = 1 and int100 = 1;
┌─count()─┐
│    9949 │
└─────────┘
1 row in set. Elapsed: 0.034 sec. Processed 10.00 million rows, 160.00 MB (290.93 million rows/s., 4.65 GB/s.)
</code></pre>
The same table, but with 1B rows instead, runs in ~1800ms<p><pre><code> > select count(*) from test_table where int1000 = 1 and int100 = 1;
┌─count()─┐
│  999831 │
└─────────┘
1 row in set. Elapsed: 1.804 sec. Processed 1.00 billion rows, 16.00 GB (554.24 million rows/s., 8.87 GB/s.)
</code></pre>
[1] Converted the table create and insert logic from here: <a href="https://github.com/sirupsen/napkin-math/blob/master/newsletter/20-compound-vs-combining-indexes/test.rb" rel="nofollow">https://github.com/sirupsen/napkin-math/blob/master/newslett...</a>
<p><pre><code> For these 64-bit index entries we’d expect to have to scan roughly:
index_row_size⋅rows=2⋅64bit⋅10^5=1.5MiB
</code></pre>
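Spelling out the quoted arithmetic (assuming, as the formula implies, two 64-bit values per index entry and 10^5 entries scanned):

```python
# Two 64-bit columns per composite index entry, 10^5 entries scanned.
bits = 2 * 64 * 10**5
mib = bits / 8 / 2**20  # bits -> bytes -> MiB
print(round(mib, 2))  # 1.53, i.e. roughly the quoted 1.5 MiB
```

So the 1.5 MiB figure is consistent with the formula; the open question is only where the 10^5 row count itself comes from.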
Where do the 10^5 rows come from? With a composite index and a point query, doesn't the database scan just the 100 returned rows?
Isn't the cause of the difference between MySQL and PostgreSQL in this particular case that COUNT is (or was?) implemented weirdly in PostgreSQL and was always slower than in MySQL?<p>Can PostgreSQL answer a query straight from the index, without touching the rows, if the index contains all the necessary fields?
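The feature being asked about is usually called an index-only scan (or a covering index). As a sketch of the concept, here's SQLite answering a COUNT entirely from an index that contains every column the query touches; PostgreSQL has an analogous index-only scan, though there it additionally depends on the visibility map, which is why its COUNTs can still be slower:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a INTEGER, b INTEGER, payload TEXT)")
con.execute("CREATE INDEX idx_ab ON t (a, b)")
con.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(i % 10, i % 7, "x" * 100) for i in range(1000)],
)

# The query only references columns a and b, both of which are in
# idx_ab, so the planner can satisfy it from the index alone and
# never has to read the (payload-heavy) table rows.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT count(*) FROM t WHERE a = 1 AND b = 1"
).fetchall()
print(plan)
```

The plan reports a covering-index search, confirming the base table is never touched.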