Another postgresql performance gotcha:<p>Find all the coupons that are expired (90 day expiration):<p><pre><code> SELECT * FROM coupon
WHERE created_at + INTERVAL '90 DAY' < now()
</code></pre>
This will not use the index on the "created_at" column and will be slow, because the comparison is against an expression (created_at plus an interval) rather than the bare indexed column.<p>You should rewrite the inequality so the arithmetic moves to the other side of the comparison:<p><pre><code> SELECT * FROM coupon
WHERE created_at < now() - INTERVAL '90 DAY'
</code></pre>
and now the query will be much faster. There are a lot of cases in postgres where simple, equivalent algebraic manipulations like this can completely change the query plan.
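To double-check that it's actually hitting the index, something like this works (the index name is made up; everything else is from the example above):<p><pre><code> -- a plain btree index on the bare column
 CREATE INDEX coupon_created_at_idx ON coupon (created_at);

 -- with the arithmetic moved to the constant side, the planner can
 -- match the predicate to the index; verify with EXPLAIN
 EXPLAIN SELECT * FROM coupon
 WHERE created_at < now() - INTERVAL '90 DAY';
</code></pre>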
I'm not sure if it's as much a bottleneck in Postgres as it is in MySQL, but I've just written a tome[1] on more effective offset/limit pagination by using something called a deferred join.<p>The main problem with offset / limit pagination is that the database is forced to fetch and discard more and more rows as you get deeper into the pages.<p>The trick to getting offset / limit to be faster is to reduce the amount of data that the database has to inspect.<p>Using a "deferred join" defers the expensive data access until <i>after</i> the pagination has been calculated.<p>So instead of<p><pre><code> select * from contacts order by updated_at desc limit 15 offset 10000
</code></pre>
you'd do<p><pre><code> select * from contacts inner join (
select id from contacts order by updated_at desc limit 15 offset 10000
) as tmp using(id)
order by updated_at desc
</code></pre>
That way the inner query can use a covering index to pull everything it needs (in some cases), and the expensive selection of full rows only happens for the handful of rows that survive the pagination.<p>This works especially well when you intend to select a lot of data and when you're filtering / sorting on more than just the primary key.<p>[1] <a href="https://aaronfrancis.com/2022/efficient-pagination-using-deferred-joins" rel="nofollow">https://aaronfrancis.com/2022/efficient-pagination-using-def...</a>
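If it helps, here's a sketch of the kind of covering index the inner query wants, using the contacts example above (the index name is made up). In Postgres you'd spell it with INCLUDE (11+); in MySQL/InnoDB a plain index on updated_at already carries the primary key:<p><pre><code> CREATE INDEX contacts_updated_at_covering
     ON contacts (updated_at DESC) INCLUDE (id);
</code></pre>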
Considering how easy the workaround for ordering by NULLS LAST is (ask for the ordered non-null values, then for the null values), I wonder why postgres doesn't do that optimization internally. At first glance it seems like you could trivially rewrite any ORDER BY x NULLS LAST query that can't use an index into two queries that can use the index, and then UNION them together.<p>Maybe it's one of those things that are so easy to work around that nobody has prioritized fixing it.
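The manual rewrite would look something like this (table and column names are made up; shown for the DESC NULLS LAST case, where the default btree ordering doesn't match):<p><pre><code> -- instead of: SELECT * FROM t ORDER BY x DESC NULLS LAST
 (SELECT * FROM t WHERE x IS NOT NULL ORDER BY x DESC)
 UNION ALL
 (SELECT * FROM t WHERE x IS NULL);
 -- the first branch can use a plain btree index on x; strictly speaking
 -- UNION ALL doesn't guarantee branch order, so a cautious version adds
 -- an outer ORDER BY on an (x IS NULL) flag
</code></pre>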
Isn't a more common and general solution to the first problem to index the expression lower(email) (1)?<p>I'll add some to the list:<p>1.<p><pre><code> select count(*) from x where not exists (select 1 from y where y.id = x.id)
</code></pre>
can be thousands of times faster than<p><pre><code> select count(*) from x where id not in (select id from y)
</code></pre>
largely because NOT EXISTS can be planned as an anti join, while NOT IN with a subquery has to honor NULL semantics and often can't be.
2.<p>This one is just weird, but I've seen (and was never able to figure out why):<p><pre><code> select x from table where id in (select id from cte) and date > $1
</code></pre>
be a lot slower than<p><pre><code> select x from table where id in (select id from cte limit (select count(*) from cte)) and date > $1
</code></pre>
3.<p>RDS is slow. I've seen select statements take 20 minutes on RDS that take a few seconds on _much_ cheaper bare metal.<p>4.<p>pg_stat_statements (2) is probably the single most useful thing you can enable/use (a minimal setup sketch is below the links).<p>5.<p>If you're ok with potentially losing data on failure, consider setting synchronous_commit = off (3). You'll still be protected from data corruption (4).<p>(1) - <a href="https://www.postgresql.org/docs/14/indexes-expressional.html" rel="nofollow">https://www.postgresql.org/docs/14/indexes-expressional.html</a><p>(2) - <a href="https://www.postgresql.org/docs/14/pgstatstatements.html" rel="nofollow">https://www.postgresql.org/docs/14/pgstatstatements.html</a><p>(3) - <a href="https://www.postgresql.org/docs/14/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT" rel="nofollow">https://www.postgresql.org/docs/14/runtime-config-wal.html#G...</a><p>(4) - <a href="https://www.postgresql.org/docs/14/wal-async-commit.html" rel="nofollow">https://www.postgresql.org/docs/14/wal-async-commit.html</a>
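Re 4, a minimal pg_stat_statements setup for anyone who hasn't used it (the LIMIT and column list are just one way to slice it):<p><pre><code> -- in postgresql.conf (needs a restart):
 --   shared_preload_libraries = 'pg_stat_statements'
 CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

 -- worst offenders by total time (column names as of PG 13+)
 SELECT query, calls, total_exec_time, mean_exec_time, rows
 FROM pg_stat_statements
 ORDER BY total_exec_time DESC
 LIMIT 10;
</code></pre>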
If your data allows it, use BRIN indexes. I have one case with append-only data, around 1TB: a 250GB btree on a date field was converted to a BRIN index of around 25MB.
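Roughly like this (index and table names are made up; the payoff depends on the rows being physically laid out in date order, which append-only inserts usually give you):<p><pre><code> -- replace the huge btree with a BRIN on the same column
 DROP INDEX IF EXISTS events_created_at_btree;
 CREATE INDEX events_created_at_brin ON events USING brin (created_at);
</code></pre>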
> Execution Time: 0.128 ms<p>This is fantastic. Wouldn't it be nice if that was the end of it and your business logic was running with the data after that exact delay?<p>It doesn't really detract from the value of query optimization (i.e. system load reduction), but I do find it a little contradictory in principle to push for sub-millisecond execution and then decide to send those results <i>to a completely different computer across a network</i>.<p>The biggest bottleneck we ever identified with any database vendor was the network stack sitting between database server and application server. I've got some SQLite-backed business services that can transact an entire user request (which involves <i>many</i> database interactions) and then already be returning a final HTTP response in the amount of time it takes PostgreSQL to complete this one example query.
Is anyone here an expert with MySQL, optimization and replication? Please contact me at greg <i>at</i> qbix,com and we’d engage you to help solve a couple of issues.<p>Our site is experiencing slowdowns and SHOW PROCESSLIST may hold some clues. We have had 10,000 downloads of our apps worldwide and collect ping statistics in MySQL, so that may have something to do with it.