TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Index Merges vs. Composite Indexes in Postgres and MySQL

175 pointsby Sirupsenover 2 years ago

6 comments

magicalhippoover 2 years ago
A composite index can also be used for a partial index scan[1], so if you&#x27;re frequently looking for just int100 as well as the int100,int1000 combination (as per the article&#x27;s example), then a composite index int100,int1000 can be used for queries just filtering on int100.<p>The order of the columns in the composite index might matter then. We got some nice savings by reordering columns in indexes and changing the join order in queries (our DB wasn&#x27;t <i>that</i> smart) or adding a &quot;useless&quot; filters to the where clause, allowing us to consolidate multiple indexes into one composite one.<p>[1]: might be using the wrong terminology here, I&#x27;m not talking about a partial index[2], but using only the first N dimensions of a M-dimensional index.<p>[2]: <a href="https:&#x2F;&#x2F;www.postgresql.org&#x2F;docs&#x2F;current&#x2F;indexes-partial.html" rel="nofollow">https:&#x2F;&#x2F;www.postgresql.org&#x2F;docs&#x2F;current&#x2F;indexes-partial.html</a>
评论 #33766949 未加载
aryndaover 2 years ago
Comparison on Clickhouse, also runs in about 30-40ms, however there&#x27;s no indexing being used and this is a full-table scan.<p><pre><code> create table if not exists test_table ( id UInt64, text1 String, text2 String, int1000 UInt64, int100 UInt64, int10 UInt64, int10_2 UInt64 ) engine = MergeTree() order by (id) ; insert into test_table with repeat(&#x27;b&#x27;, 1024) as one_kib, repeat(&#x27;b&#x27;, 255) as bytes_255 select number as id, one_kib, bytes_255, rand() % 1000 as int1000, rand() % 100 as int100, rand() % 10 as int10, rand() % 10 as int10_2 from numbers(10e6) ; &gt; select count(*) from test_table where int1000 = 1 and int100 = 1; ┌─count()─┐ │ 9949 │ └─────────┘ 1 row in set. Elapsed: 0.034 sec. Processed 10.00 million rows, 160.00 MB (290.93 million rows&#x2F;s., 4.65 GB&#x2F;s.) </code></pre> The same table but with 1B rows instead, runs in ~1800ms<p><pre><code> &gt; select count(*) from test_table where int1000 = 1 and int100 = 1; ┌─count()─┐ │ 999831 │ └─────────┘ 1 row in set. Elapsed: 1.804 sec. Processed 1.00 billion rows, 16.00 GB (554.24 million rows&#x2F;s., 8.87 GB&#x2F;s.) </code></pre> [1] Converted the table create and insert logic from here: <a href="https:&#x2F;&#x2F;github.com&#x2F;sirupsen&#x2F;napkin-math&#x2F;blob&#x2F;master&#x2F;newsletter&#x2F;20-compound-vs-combining-indexes&#x2F;test.rb" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;sirupsen&#x2F;napkin-math&#x2F;blob&#x2F;master&#x2F;newslett...</a>
评论 #33769488 未加载
评论 #33770421 未加载
评论 #33769250 未加载
pdhborgesover 2 years ago
<p><pre><code> For these 64-bit index entries we’d expect to have to scan roughly: index_row_size⋅rows=2⋅64bit⋅10^5=1.5MiB </code></pre> Where do the 10^5 rows come from? With a composite index and a point query doesn&#x27;t the database scan just the 100 returned rows?
评论 #33767186 未加载
Lukas1994over 2 years ago
Good stuff! What&#x27;s the size difference between the composite index vs the two separate indices?
评论 #33766434 未加载
bawolffover 2 years ago
Honestly im kind of surprised that they are even that close. I wonder if this changes at scale when the intersection is larger.
评论 #33767402 未加载
scotty79over 2 years ago
Isn&#x27;t the cause of the difference between MySQL and PostgreSQL in this particular case is that COUNT is (was?) implemented weirdly in PostgreSQL and was always slower than in MySQL?<p>Can PostgreSQL respond to any query straight from index without touching the rows if the index contains all necessary fields?