This is a great list to start with. Postgres logical replication is great for capturing changes, but the documentation is really lacking, and gives the impression that it was really just designed for Postgres -> Postgres replication, with a database administrator present to handle any edge cases.<p>Some others I've run into as part of developing a consumer for PowerSync:<p>1. LSN is strictly increasing for transaction commits, but not for row operations inside transactions. Specifically, when you have multiple concurrent transactions, they'll have overlapping LSN ranges for operations inside them.<p>2. When you restart replication, it always restarts at the beginning of a transaction. Combined with the above, there's not much of a point in trying to track progress within transactions.<p>3. Some hosting providers such as AWS RDS flushes the WAL every 5 minutes, causing an increase of 64MB each time. Make sure to acknowledge progress on "Primary keepalive messages", otherwise the WAL for an idle database will grow at over 750MB/hour. (Unlike what the article mentions, you don't need heartbeat messages as long as you acknowledge the primary keepalive messages.)<p>4. If you have a table without an unique index, you can replicate using REPLICA IDENTITY FULL or REPLICA IDENTITY NOTHING. But if you have duplicate rows in the source table, there will be no way to distinguish them with logical replication, and you can end up with inconsistent state.<p>DDL changes are also tricky. As the post mentions, you can detect columns added or removed using just the logical replication protocol. But there are other cases not covered:<p>1. You only get notifications of new tables once data has been inserted.<p>2. You get no notifications of dropped tables.<p>3. If a column is added with a default value, that default value is present on new rows, but you don't get any updates on existing rows.<p>Catering for initial snapshots is also not trivial:<p>1. Create a new logical replication slot using the logical replication protocol, with the "EXPORT_SNAPSHOT" option to create a "snapshot" of the state.<p>2. On a separate connection, start a repeatable read transaction, with `SET TRANSACTION SNAPSHOT '<snapshot>'` using the above snapshot.<p>3. Do the initial snapshot using this transaction.<p>4. Once the snapshot is complete, continue using logical replication.<p>Care must be taken that all data types match between the initial snapshot and logical replication. Also check things like columns matching up between the two (e.g. generated columns included in the initial snapshot, but not logical replication).<p>There are also a couple of cases where replication slots will break. I've seen a couple of cases triggered by the source database running out of disk space, despite the rest of the database having recovered in the meantime. Some examples:<p>1. I've seen a delay of hours without any messages being sent on the replication protocol, likely due to a large transaction in the WAL not committed when running out of disk space.<p>2. `PgError.58P02: could not create file \"pg_replslot/<slot_name>/state.tmp\": File exists`<p>2. `replication slot "..." is active for PID ...`, with some system process holding on to the replication slot.<p>3. `can no longer get changes from replication slot "<slot_name>". ... This slot has never previously reserved WAL, or it has been invalidated`.<p>Each one of those can be handled, typically by restarting postgres and/or deleting the replication slot. But there are many different edge cases that are difficult to know about and test for, before suddenly running into it in a production system. So generally it's very difficult to create a system that "just works", and can automatically recover without manual intervention in cases like those.