Every once in a while I find an article so well written that it walks me through a complex technical situation in terms a layman could follow, while also meaningfully expanding my understanding of a tool I use every day. This is definitely one of those, and I just want to thank the author(s) for taking the time to write it up. It was interesting and enlightening.
> There was a long-running transaction, usually relating to PostgreSQL's autovacuuming, during the time. The stalls stopped quickly after the transaction ended.<p>What about the other end?
Why does vacuum need to be a long-running transaction, and why can't it be cut into shorter transactions?
My takeaway is that they were testing the limits of PostgreSQL's capabilities, and then reverted the change in a mad dash.<p>This would have been an awesome opportunity for GitLab to show how OSS they are and fund a PostgreSQL developer to help GitLab build these boundary-pushing designs.
Definitely a fun article.<p>This was the key takeaway for me.<p><pre><code> SubtransControlLock indicates that the query is waiting for PostgreSQL to load subtransaction data from disk into shared memory.
</code></pre>
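For anyone who wants to check their own cluster for this, here is a minimal sketch of how one might spot sessions stuck on that wait event. The `pg_stat_activity` view and its `wait_event` column are real; the helper name `subtrans_waiters`, the sample rows, and the exact set of event names are my own assumptions. As far as I know, PostgreSQL 13 and later report this wait as `SubtransSLRU` or `SubtransBuffer` rather than `SubtransControlLock`, so the sketch checks for all three.

```python
# Query one could run against a live server (not executed in this sketch).
WAITERS_SQL = """
SELECT pid, wait_event_type, wait_event, state, query
FROM pg_stat_activity
WHERE wait_event IS NOT NULL
"""

# Assumed set of subtransaction-related wait events across versions:
# "SubtransControlLock" is the name quoted in the article; the other two
# are, to my knowledge, the names used from PostgreSQL 13 onward.
SUBTRANS_WAIT_EVENTS = {"SubtransControlLock", "SubtransSLRU", "SubtransBuffer"}


def subtrans_waiters(rows):
    """Return the rows whose wait_event is subtransaction-related.

    `rows` is a list of dicts shaped like pg_stat_activity results
    (hypothetical helper, not part of any library).
    """
    return [r for r in rows if r.get("wait_event") in SUBTRANS_WAIT_EVENTS]


# Made-up sample rows standing in for a real query result.
sample = [
    {"pid": 101, "wait_event_type": "LWLock", "wait_event": "SubtransControlLock"},
    {"pid": 102, "wait_event_type": "Client", "wait_event": "ClientRead"},
    {"pid": 103, "wait_event_type": None, "wait_event": None},
]

print([r["pid"] for r in subtrans_waiters(sample)])  # prints [101]
```

In practice you would feed the helper the rows returned by running `WAITERS_SQL` through whatever driver you use, and sample it repeatedly, since a single snapshot can easily miss short stalls.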
I felt the article fell short for two reasons:
(1) It didn't really articulate the need for transactions in the first place (database integrity), nor did it discuss what this change means for integrity.<p>(2) It didn't articulate the possibilities of other architectures (pushing to a read cache other than PostgreSQL, such as Cassandra).<p>I got the feeling they were really pushing PostgreSQL to its limits in their cluster with their load, and that it was time to consider another design.