I am architecting a project that is smaller in scale but bears some resemblance, and I’ve already decided to be very careful about GET vs POST/PUT semantics: all GET traffic goes only to the replicas, and all POST/PUT traffic goes through a connection pool to the primary. That, I expect, will carry me until I am well past 1% of their traffic.<p>I’m not sure whether I’ve missed something in their LSN work or whether this indicates that the GET-semantics horse has already escaped their proverbial barn. Within the call-response of a single request, which is what they seem to be talking about, none of this should be necessary. Right?<p>However, a cascade of narrowly spaced follow-up requests could easily catch you in this trick. Lie. Whatever you wish to call it.
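To make the routing rule concrete, here is a minimal sketch of what I have in mind. The `Router` class, the `SAFE_METHODS` set, and the backend names are all hypothetical, and it assumes the application never writes inside a GET handler. It does nothing about replication lag, so a GET sent right after a POST can still land on a replica that has not caught up.

```python
import itertools

# Methods the application promises are read-only (an assumption, not enforced here).
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


class Router:
    """Send safe methods to replicas (round-robin) and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, every request falls back to the primary.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def pick(self, method):
        if method.upper() in SAFE_METHODS and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = Router("primary", ["replica-1", "replica-2"])
print(router.pick("GET"))   # a replica
print(router.pick("POST"))  # the primary
```

In a real service the strings would be connection pools rather than names, but the decision logic would stay about this small.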
Bitbucket is a perfect case for sharding. You could have one DB of users, and N DBs of user data, with the first referencing the latter, and then distribute user data among the latter. You could do this straight up with PostgreSQL without any special server-side software. You could also have a PG server using FDW to act as a proxy for all the DBs a client needs to be talking to. There are many other options too.
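A rough sketch of the directory idea, with an in-memory dict standing in for the users DB. `ShardDirectory` and the shard names are made up for illustration, and placing new users on the least-loaded shard is just one possible policy.

```python
class ShardDirectory:
    """Stands in for the 'users' DB: maps each user to the shard holding their data."""

    def __init__(self, shard_names):
        self.shards = list(shard_names)
        self.user_to_shard = {}
        self.load = {name: 0 for name in self.shards}

    def shard_for(self, user_id):
        """Return the user's shard, assigning the least-loaded one on first sight."""
        shard = self.user_to_shard.get(user_id)
        if shard is None:
            # Ties go to the shard listed first.
            shard = min(self.shards, key=lambda s: self.load[s])
            self.user_to_shard[user_id] = shard
            self.load[shard] += 1
        return shard


directory = ShardDirectory(["userdata-0", "userdata-1", "userdata-2"])
for user in ["alice", "bob", "carol", "alice"]:
    print(user, "->", directory.shard_for(user))
```

Because the mapping is stored rather than computed from a hash, a user can later be moved to another shard by copying their data and updating one row in the directory.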
Bitbucket (mostly via Atlassian) worked very hard to chase everyone off their platform in the vain hope that the OSS community would start using Jira. Clearly they bet wrong, but unfortunately they aren't going to get any users back for a very long time.
Is it just me or does looking at write logs to determine read consistency just seem like a really bad way to do this vs a different load balancing strategy?
Curious that they use LSN tracking vs making the replication synchronous. I would be curious to see the performance numbers and the reasoning behind the choice.
Oracle offers DML redirection for Active Data Guard setups: a write issued on the replica database is redirected to the primary, which lets applications that make infrequent writes run against the Active Data Guard replica. The write completes only once the replica has seen the change from the primary, which eliminates the race condition and avoids this kind of complex re-engineering.
> we find a replica that is as up to date as the user's saved LSN<p>The article doesn't explain how they know the current LSN of each replica, though.
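My guess, and it is only a guess since the article doesn't say, is that something polls each replica with `SELECT pg_last_wal_replay_lsn()` and compares the result against the value saved from `pg_current_wal_lsn()` on the primary at write time. Below is a small sketch of the comparison step only. `parse_lsn` and `caught_up_replicas` are hypothetical helpers, and the replica LSNs are hard-coded where a real system would query them.

```python
def parse_lsn(text):
    """Convert PostgreSQL's textual LSN ('hi/lo', both hex) to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def caught_up_replicas(replica_lsns, saved_lsn):
    """Return the replicas whose replayed LSN is at or past the user's saved LSN."""
    target = parse_lsn(saved_lsn)
    return [name for name, lsn in replica_lsns.items() if parse_lsn(lsn) >= target]


# Hard-coded here; in practice these would come from polling each replica.
replica_lsns = {"replica-1": "0/3000060", "replica-2": "0/2FFFFF0"}
print(caught_up_replicas(replica_lsns, "0/3000000"))
```

The LSNs have to be compared as numbers: as plain strings, "0/A000000" sorts after "0/9FFFFFFF" even though it is the older position.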