People need to stop writing about distributed systems if they won’t explicitly talk about the P in the CAP theorem.<p>What happens here in the event of a network partition? If I’m reading the diagram right, several failure modes are possible. One is a network failure between the Aurora node and the shared storage layer.<p>I would expect that to result in a failed transaction and a rollback, right? That’s not really all that available though, is it?<p>What about a network partition between availability zones? Maybe your local node thinks it has a quorum because it can no longer see a different AZ? What happens when the partition heals? How do the nodes reconcile? What if you’re working with multiple AZs and you have application nodes in each AZ writing to a cluster of nodes that all think they have a quorum because they can’t see each other?<p>What happens to those writes after the partition? Is every transaction terminated and rolled back because us-east-1 can’t see us-west-3?<p>If they aren’t, how do the nodes achieve consensus? Are we doing a MongoDB thing here and just tossing everything after the last shared state? And even if that does work out, what’s actually going on, and how is it achieved?<p>Look, if you need transactions at all, you most likely need isolation level serializable, and that’s incredibly hard to get right.<p>Many systems really don’t need any of that at all, but when you need it, it has to be right.<p>This doesn’t address any of the hardest parts of distributed systems. It doesn’t address why I would take any application that stores data in an RDBMS and use this.<p>I’m sure everyone on that team is working hard. But people have to stop writing about distributed systems like this.
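<p>To make the quorum question concrete, here is a minimal sketch of the rule I would expect a write-up like this to spell out. It assumes a fixed cluster membership and a strict-majority quorum; both are my assumptions, not anything the article states about Aurora, and the names `has_quorum` and `partition_report` are made up for illustration.

```python
def has_quorum(reachable: int, cluster_size: int) -> bool:
    """Strict majority of a FIXED membership (assumed rule, not Aurora's documented one)."""
    return reachable > cluster_size // 2


def partition_report(cluster_size: int, side_a: int) -> dict:
    """Split the cluster into two sides that cannot see each other
    and report which side, if any, may keep accepting writes."""
    side_b = cluster_size - side_a
    return {
        "a_can_write": has_quorum(side_a, cluster_size),
        "b_can_write": has_quorum(side_b, cluster_size),
    }


if __name__ == "__main__":
    # Even split of a 6-node cluster: neither side has a majority.
    print(partition_report(6, 3))
    # 4/2 split: only the larger side may write.
    print(partition_report(6, 4))
```

Under that rule two disjoint sides can never both hold a quorum, so the split-brain case only appears if nodes count quorum against whoever they can currently see rather than against the full membership. The price is that the minority side (or both sides, on an even split) has to refuse writes, and that availability trade-off is exactly what I want the article to talk about.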