We got hit with a good bit of the postgres iceberg this week.<p>VACUUM by default runs with the TRUNCATE option. This will, if possible and valuable, truncate the heap and return those pages to the operating system. This operation takes an AccessExclusive lock, which blocks reads and writes. It is written to be very deferential and only runs if nothing else is trying to take a competing lock.<p>AccessExclusive is a unique lock in that it is replicated to standbys. When the writer holds the lock, reads on the standbys will be blocked. However, the writer doesn't know about them, so its conservative approach of only acquiring the lock if there's no contention is completely thwarted.<p>Finally, if vacuum truncate is interrupted by lock contention it must repeat a scan of the heap to ensure the blocks are still empty before proceeding with the truncation.<p>All of these details combined meant our writer got stuck in a "grab lock, scan, be interrupted, release lock, grab lock, restart scan, be interrupted, release lock... etc" loop. The replication of this lock to the reader seemed to have batched together, so the readers never got to run in between the loop iterations. The vacuum never actually succeeded because it kept re-scanning. We had to intervene manually to disable autovacuum and cancel the in-progress jobs.<p>We've hit plenty of weird postgres issues, but this was a new one for us.<p>Edit: of note is the somewhat rare lack of clarity in the postgres docs. "VACUUM (without FULL)" is documented as only requiring ShareUpdateExclusive, which is why autovacuum is considered safe to run. Turns out that's not true.
Heh, this is pretty great. Is it good or bad that I know almost all of these?<p>Not sure "SERIAL is non-transactional" belongs at the bottom, I thought that was common knowledge... but that might be from years doing Oracle where sequences are more explicit.
This is a pretty awesome way of summarising a topic. Perhaps an advanced version could offer an ELI5 explanation on hoverover. I'm in the first or second layer of the postgres version of the meme, yet would gladly spend hours exploring layers I don't yet know about (simply reading the names of the ideas in each layer was incredibly interesting tbh).
I struggle to understand why the Halloween Problem is listed as a Postgres issue. The concept that a where condition would run until all rows are condition false is just categorically false for Postgresql.<p><a href="https://en.wikipedia.org/wiki/Halloween_Problem" rel="nofollow noreferrer">https://en.wikipedia.org/wiki/Halloween_Problem</a>
found another youtube video that explain the seme meme<p><a href="https://youtu.be/JZRWkfXNQOk" rel="nofollow noreferrer">https://youtu.be/JZRWkfXNQOk</a>