TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

The surprising impact of medium-size texts on PostgreSQL performance

182 点作者 haki超过 4 年前

7 条评论

malisper超过 4 年前
I&#x27;ve mentioned this story here before, but one of the most surprising performance gains I saw was by eliminating TOAST look ups. If I recall correctly, each time you use the `-&gt;&gt;` operator on a TOASTed JSONb column, the column will be deTOASTed. That means if you write a query like:<p><pre><code> SELECT x -&gt;&gt; &#x27;field1&#x27;, x -&gt;&gt; &#x27;field2&#x27;, x -&gt;&gt; &#x27;field3&#x27; FROM table </code></pre> and x is TOASTed, Postgres will deTOAST x three different times. This multiplies the amount of data that needs to be processed and dramatically slows things down.<p>My first attempt to fix this was to read the field in one query and use a subselect to pull out the individual fields. This attempt was thwarted by the Postgres optimizer which inlined the subquery and still resulted in deTOASTing the field multiple times.<p>After a discussion with the Postgres IRC, RhodiumToad pointed out that if I add OFFSET 0 to the end of the subquery, that will prevent Postgres from inlining it. After retrying that, I saw an order of magnitude improvement due to eliminating the redundant work.
评论 #24839954 未加载
评论 #24844098 未加载
评论 #24840131 未加载
michelpp超过 4 年前
The article only briefly touches on this, but you can control various aspects of how postgres decides to store your value using `ALTER TABLE ... SET STORAGE` including the ability to make it inline, uncompressed, compressed, or TOASTED and also to change the threshold parameters to when postgres decides it for you. The documentation is about a 1&#x2F;3 of the way down this page:<p><a href="https:&#x2F;&#x2F;www.postgresql.org&#x2F;docs&#x2F;13&#x2F;sql-altertable.html" rel="nofollow">https:&#x2F;&#x2F;www.postgresql.org&#x2F;docs&#x2F;13&#x2F;sql-altertable.html</a><p>This is useful if you have, for example, short character codes coming from some external system. The default will be EXTENDED (external, compressed) but in order to make the absolutely smallest stored tuple possible, if you know it&#x27;s a short or fixed length, you can go MAIN (internal, compressed) or PLAIN (internal, uncompressed).
ars超过 4 年前
This is why most databases have separate char&#x2F;varchar and text types: char and varchar are stored inline, and text is stored externally.<p>PostgreSQL doesn&#x27;t do this, there is no difference between any of the character types, they are all stored the same way, and the type only serves to validate the data. See: <a href="https:&#x2F;&#x2F;www.postgresql.org&#x2F;docs&#x2F;current&#x2F;datatype-character.html" rel="nofollow">https:&#x2F;&#x2F;www.postgresql.org&#x2F;docs&#x2F;current&#x2F;datatype-character.h...</a><p>MySQL even has TINYTEXT for when you want to store small strings outside the table (for performance). With tinytext only a single byte is stored in the table, and the rest externally, so I&#x2F;O is reduced when doing a full table scan, if you don&#x27;t need to read the tinytext column.
评论 #24841173 未加载
ainar-g超过 4 年前
For those wondering, “TOAST” actually stands for “The Oversized-Attribute Storage Technique”. See <a href="https:&#x2F;&#x2F;www.postgresql.org&#x2F;docs&#x2F;13&#x2F;storage-toast.html" rel="nofollow">https:&#x2F;&#x2F;www.postgresql.org&#x2F;docs&#x2F;13&#x2F;storage-toast.html</a>. The link also provides some implementation details.
dspillett超过 4 年前
There can be significant impact from off-page types (VARCHAR(MAX), NVARCHAR(MAX), VARBINARY(MAX), and the deprecated TEXT&#x2F;NTEXT&#x2F;BINARY) in SQL Server, and presumably other DBMs, too.<p>Though one key difference is that SQL Server doesn&#x27;t compress the off-page parts as this article states postgres does. In fact even if you have the table set to compress using either row or page compression option, off-page data is not compressed.
评论 #24838499 未加载
outworlder超过 4 年前
One thing to keep in mind is that TOAST tables also require vacuum. As any other table, they will also count towards the TXID limits. Make sure to tune the DB accordingly if you have too many of those and the DB is busy with writes.
holstvoogd超过 4 年前
Very interesting! I&#x27;ve spend many moons optimizing postgresql configs and queries, never ran across this before :)<p>Pretty sure I have some targets to try this on even