
PostgreSQL query performance bottlenecks

249 points by pawurb over 3 years ago

8 comments

throwaway858 over 3 years ago
Another PostgreSQL performance gotcha:

Find all the coupons that are expired (90-day expiration):

    SELECT * FROM coupon WHERE created_at + INTERVAL '90 DAY' < now()

This will not use the index on the "created_at" column and will be slow.

You should rewrite the inequality to:

    SELECT * FROM coupon WHERE created_at < now() - INTERVAL '90 DAY'

and now the query will be much faster. There are a lot of cases in Postgres where simple equivalent algebraic manipulations can completely change the query plan.
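
A minimal sketch of how to verify this with EXPLAIN, assuming a plain B-tree index on created_at (the index name is hypothetical):

    -- The first form wraps the indexed column in arithmetic, so the
    -- planner falls back to a sequential scan; the second leaves the
    -- column bare on one side of the comparison and can use the index.
    CREATE INDEX coupon_created_at_idx ON coupon (created_at);

    EXPLAIN SELECT * FROM coupon WHERE created_at + INTERVAL '90 DAY' < now();
    EXPLAIN SELECT * FROM coupon WHERE created_at < now() - INTERVAL '90 DAY';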
aarondf over 3 years ago
I'm not sure if it's as much a bottleneck in Postgres as it is in MySQL, but I've just written a tome[1] on more effective offset/limit pagination using something called a deferred join.

The main problem with offset/limit pagination is that the database is forced to fetch and discard more and more rows as you get deeper into the pages.

The trick to making offset/limit faster is to reduce the amount of data the database has to inspect.

Using a "deferred join" defers the expensive data access until *after* the pagination has been calculated.

So instead of

    select * from contacts order by updated_at desc limit 15 offset 10000

you'd do

    select * from contacts
    inner join (
        select id from contacts order by updated_at desc limit 15 offset 10000
    ) as tmp using(id)
    order by updated_at desc

That way the inner query can use a covering index to pull everything it needs (in some cases), and the expensive selecting is done on fewer rows.

This works especially well when you intend to select a lot of data and when you're filtering/sorting on more than just the primary key.

[1] https://aaronfrancis.com/2022/efficient-pagination-using-deferred-joins
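
The covering index mentioned above is what makes the inner query cheap. A minimal sketch under the comment's schema assumptions (the index name is hypothetical):

    -- Both the sort column and the join key live in the index, so the
    -- inner select can often be answered by an index-only scan without
    -- touching the table heap.
    CREATE INDEX contacts_updated_at_id_idx ON contacts (updated_at DESC, id);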
wongarsu over 3 years ago
Considering how easy the workaround for ordering with NULLS LAST is (asking for ordered non-null values, then for null values), I wonder why Postgres doesn't do that optimization internally. At first glance it seems like you could trivially rewrite any ORDER BY x NULLS LAST query that can't use an index into two queries that can use the index, and then UNION them together.

Maybe it's one of those things that are so easy to work around that nobody has prioritized fixing them.
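
A sketch of the rewrite the comment describes, with hypothetical table and column names; each branch can use a plain index on x:

    -- Ordered non-null rows first, then the null rows; the outer LIMIT
    -- caps the combined result. Note that UNION ALL does not formally
    -- guarantee branch order, though Postgres appends them in sequence.
    (SELECT * FROM t WHERE x IS NOT NULL ORDER BY x LIMIT 10)
    UNION ALL
    (SELECT * FROM t WHERE x IS NULL LIMIT 10)
    LIMIT 10;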
latch over 3 years ago
Isn't a more common and general solution to the first problem to index the expression lower(email) (1)?

I'll add some to the list:

1.

    select count(*) from x where not exists (select 1 from y where id = x.id)

can be thousands of times faster than

    select count(*) from x where id not in (select id from y)

2.

This one is just weird, but I've seen (and was never able to figure out why):

    select x from table where id in (select id from cte) and date > $1

be a lot slower than

    select x from table where id in (select id from cte limit (select count(*) from cte)) and date > $1

3.

RDS is slow. I've seen select statements take 20 minutes on RDS that take a few seconds on _much_ cheaper bare metal.

4.

pg_stat_statements (2) is probably the single most useful thing you can enable/use.

5.

If you're OK with potentially losing data on failure, consider setting synchronous_commit = off (3). You'll still be protected from data corruption (4).

(1) https://www.postgresql.org/docs/14/indexes-expressional.html

(2) https://www.postgresql.org/docs/14/pgstatstatements.html

(3) https://www.postgresql.org/docs/14/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT

(4) https://www.postgresql.org/docs/14/wal-async-commit.html
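
Two of these points lend themselves to a short sketch: the lower(email) expression index from the opening question, and enabling pg_stat_statements (table and index names are hypothetical; the extension also has to be listed in shared_preload_libraries):

    -- Expression index: WHERE lower(email) = ... can now use an index
    -- instead of lowercasing every row in a sequential scan.
    CREATE INDEX users_lower_email_idx ON users (lower(email));

    -- After adding shared_preload_libraries = 'pg_stat_statements'
    -- to postgresql.conf and restarting:
    CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

    -- The queries consuming the most total execution time:
    SELECT query, calls, total_exec_time
    FROM pg_stat_statements
    ORDER BY total_exec_time DESC
    LIMIT 10;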
itsthecourier over 3 years ago
If your data allows it, use BRIN indexes. I have one case with append-only data, around 1 TB, where a 250 GB B-tree on a date field was converted to a BRIN index of around 25 MB.
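
A minimal sketch, assuming a hypothetical append-only events table whose rows arrive roughly in created_at order (the physical correlation BRIN relies on):

    -- BRIN stores one summary per block range rather than one entry per
    -- row, which is why it can be orders of magnitude smaller.
    CREATE INDEX events_created_at_brin ON events USING brin (created_at);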
laurent123456 over 3 years ago
Why was the title changed? It now seems like a general article about query performance, while it's really just about 5 specific cases.
bob1029 over 3 years ago
> Execution Time: 0.128 ms

This is fantastic. Wouldn't it be nice if that was the end of it and your business logic was running with the data after that exact delay?

It doesn't really detract from the value of query optimization (i.e. system load reduction), but I do find it a little contradictory in principle to push for sub-millisecond execution and then decide to send those results *to a completely different computer across a network*.

The biggest bottleneck we ever identified with any database vendor was the network stack sitting between the database server and the application server. I've got some SQLite-backed business services that can transact an entire user request (which involves *many* database interactions) and already be returning a final HTTP response in the amount of time it takes PostgreSQL to complete this one example query.
EGreg over 3 years ago
Is anyone here an expert in MySQL, optimization, and replication? Please contact me at greg *at* qbix,com and we'd engage you to help solve a couple of issues.

Our site is experiencing slowdowns and SHOW PROCESSLIST may hold some clues. We have had 10,000 downloads of our apps worldwide and collect ping statistics in MySQL, so that may have something to do with it.