Author here. We've spent the past year adding read committed isolation to CockroachDB.<p>There were many interesting design decisions, such as:<p>- whether to use multiple snapshots or a single snapshot per statement<p>- how to handle read uncertainty intervals<p>- how to incorporate SELECT FOR UPDATE locking into Raft<p>- how to handle SELECT FOR UPDATE subqueries<p>- how to prevent lost update anomalies between two UPDATEs<p>Some of the gory details are in the public RFC: <a href="https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20230122_read_committed_isolation.md">https://github.com/cockroachdb/cockroach/blob/master/docs/RF...</a><p>This blog post just discusses the last point, but please AMA.
It's often talked about how new sql databases offer better scalability than standard SQL databases, but I think it's maybe sometimes underappreciated how (some, not all) of them are also much simpler in terms of their consistency models.<p>I'd speculate this is because postgres and friends try to eek out every bit of single node performance (which helps with single row throughout and overall throughout, which is obviously much better for them than newsql) but the scalability of new SQL databases might allow them to prefer easy consistency over single node performance.<p>Possibly this is also just the passage of time benefiting newer systems.
How does the CockroachDB approach not deadlock? Surely retrying could encounter a situation where two competing UPDATE will lock rows in different order, and no amount of retrying will unlock the required rows, right?
I'm having trouble with the example given.<p>If `UPDATE player SET level = 'AA' WHERE team = 'Gophers';` is executed before the player swap, then why should "Stonebreaker" be upgraded to "AA"? I'd be pretty mad at my database if I sent those 2 queries <i>in sequence</i> and my DB decided to re-order them.<p>The sleep actually really complicates things here. I understand some queries run slower than others and the sleep is a useful tool to artificially slow things down, but now I don't know I don't know if I should interpret that as one command or two. If `WITH sleep AS (SELECT pg_sleep(5))
UPDATE player SET level = 'AA' FROM sleep WHERE team = 'Gophers';` is atomic then I'd expect it to put a lock on the 3 Gophers (which doesn't include Stonebreaker), wait the 5 seconds and then complete the update. The player swap would be blocked for those 5 seconds because it touches a row that's being updated.
For similar isolation level anomalies in real world applications check out this SIGMOD '17 paper:<p>ACIDRain: Concurrency-Related Attacks on
Database-Backed Web Applications: <a href="http://www.bailis.org/papers/acidrain-sigmod2017.pdf" rel="nofollow">http://www.bailis.org/papers/acidrain-sigmod2017.pdf</a>
How you incorporate SELECT FOR UPDATE locking into Raft? Also maybe slightly off topic, but how does the CockroachDB deals with read-only queries and raft log?
Idk to me this looks like a modeling issue of the data. There should be a team table that contains team specific data such as the skill level, then these two queries wouldn't run into any problems.