Jepsen: MySQL 8.0.34

364 pointsby aphyrover 1 year ago

16 comments

tzoneover 1 year ago

I have been advocating for the longest time that "repeatable read" is just a bad idea. Even if implementations were perfect. Even when it works correctly in the Database, it is still very tricky to reason about when dealing with complex queries.I think two isolation levels that make sense are either:* read committed* serializableYou either go all the way to have a serializable setup, where there are no surprises. OR, you go in read committed direction where it is obvious that if you want have a consistent view of the data within a transaction, you have to lock the rows before you start reading them.Read committed is very similar to just regular multi-threaded code and its memory management, so most engineers can get a decent intuitive sense for it.Serializable is so strict that it is pretty hard to make very unexpected mistakes.Anything in-between is a no man's land. And anything less consistent than Read Committed is no longer really a database.

评论 #38705215 未加载

评论 #38707157 未加载

评论 #38704517 未加载

评论 #38704474 未加载

pellaover 1 year ago

FOSSDEM-2024 :Isolation Levels and MVCC in SQL Databases: A Technical Comparative Study// Oracle, MySQL, SQL Server, PostgreSQL, and YugabyteDB.<a href="https://fosdem.org/2024/schedule/event/fosdem-2024-3600-isolation-levels-and-mvcc-in-sql-databases-a-technical-comparative-study/" rel="nofollow noreferrer">https://fosdem.org/2024/schedule/event/fosdem-2024-3600-isol...</a>

评论 #38718725 未加载

amlutoover 1 year ago

How does append (a) map onto actual SQL operations on the given tables? Are the TEXT fields being used as lists?Also… I’ve been issues in MySQL repeatable read mode where a single SELECT, selecting a single row, returned impossible results. I think it was:<pre><code> SELECT min(value), max(value) FROM table WHERE id = 1; </code></pre> where id is a primary key. I got two different values for min and max. That was a fun one.

评论 #38696828 未加载

评论 #38696799 未加载

Corradoover 1 year ago

I appreciate the write-up and the nod to AWS RDS. However, I was wondering if there was any focus on AWS Aurora (MySQL)? For those that don't know, AWS build a protocol compatible database platform that pretends to be MySQL or PostgreSQL. It would be interesting to see if Aurora MySQL has the same "features" as RDS or even MariaDB.

评论 #38707747 未加载

评论 #38713364 未加载

PeterZaitsevover 1 year ago

Facinating read. I think it is a great illustration to show how many "practically working systems" can be built on the foundation exhibiting so many consistency artifacts

评论 #38704482 未加载

dasmoopover 1 year ago

The RDS replication that stopped working after 5min messing with it, with no alert of failed health check is a bit worrying...

评论 #38701394 未加载

评论 #38704537 未加载

camgunzover 1 year ago

> In 2022 Jepsen commissioned the University of Porto’s INESC TEC to develop LazyFS: a FUSE filesystem for simulating the loss of un-fsynced writesI love this; what a great example of pushing the state of the art forward. Kudos!

baqover 1 year ago

how about that, I planned to do some work today.aphyr, thank you. call me maybe and later jepsen.io have been consistently some of the best content I've ever read on the internet.

评论 #38697078 未加载

hipadev23over 1 year ago

"SELECT ... FOR UPDATE" seems to be the answer to all these issues right? Lock the rows you're going to be updating and suddenly everything works as advertised.

评论 #38704431 未加载

评论 #38704291 未加载

klysmover 1 year ago

In my experience, most developers don't even consider isolation level in the first place and just take whatever the default is. Any race conditions are met with an 'oh that's weird', and then they move on.

评论 #38696811 未加载

评论 #38696489 未加载

评论 #38703685 未加载

评论 #38697928 未加载

PeterCorlessover 1 year ago

How much of what is contained within this analysis of MySQL is going to be the same-same for MariaDB, given that it uses InnoDB as the default storage engine?

评论 #38704095 未加载

评论 #38703962 未加载

评论 #38701427 未加载

评论 #38699170 未加载

beltsazarover 1 year ago

I understand why the default transaction isolation level of most DBMS is weaker than serializable (it's for benchmark purposes), but I'd argue the best default is serializable. Most DBMS users don't even know there are many consistency models [1]. They expect transactions to "just work," i.e. to appear to have occurred in some total order, which is the definition of serializability [2]. And to some who know when to use a weaker isolation level for better performance can always set it per transaction [3].---[1] <a href="https://jepsen.io/consistency" rel="nofollow noreferrer">https://jepsen.io/consistency</a>[2] <a href="https://jepsen.io/consistency/models/serializable" rel="nofollow noreferrer">https://jepsen.io/consistency/models/serializable</a>[3] <a href="https://www.postgresql.org/docs/16/sql-set-transaction.html" rel="nofollow noreferrer">https://www.postgresql.org/docs/16/sql-set-transaction.html</a>

评论 #38696877 未加载

评论 #38700919 未加载

评论 #38700392 未加载

评论 #38697651 未加载

评论 #38697990 未加载

评论 #38696784 未加载

评论 #38701279 未加载

评论 #38696868 未加载

karmakazeover 1 year ago

My takeaways:> 4.2 Recommendations> The core problem is that MySQL claims to implement Repeatable Read but actually provides something much weaker. We see two avenues to resolve this problem.> The first is to keep MySQL’s behavior as it is, and to clearly document the consistency model “Repeatable Read” actually provides. There is precedent in other databases: PostgreSQL’s Repeatable Read is actually Snapshot Isolation, and exhibits behaviors which violate PL-2.99 Repeatable Read. However, PostgreSQL’s documentation eventually mentions that their Repeatable Read implementation is actually Snapshot Isolation. MySQL could similarly document that their “Repeatable Read” means “Read Committed, plus some sort of guarantees that hold until the transaction writes something, at which point mysteries occur.” A precise characterization of those mysteries would be most welcome.Calling what MySQL's "Repeatable Read" as "Read Committed plus..." would be more confusing as even the simplest repeated read without mutations wouldn't work as expected. The documentation should be more upfront about how MySQL "Repeatable Read" doesn't mean what might be expected. In the meantime keep the "MySQL consistent read documentation"[0] close by.> The second option is to treat these behaviors as bugs and fix them. Jepsen would be delighted if MySQL and other vendors were to commit to providing PL-2.99 Repeatable Read. However, even satisfying the incomplete, ambiguous ANSI definition of Repeatable Read would be an improvement over current affairs.I doubt this would be feasible with the extent of deployment. At scale bug-fixes are bugs in themselves. The best that could be done is to create distinct isolation levels for the existing MySQL-RR and the compliant RR isolation levels (somewhat akin to the utf8/utf8mb4 evolution).[0] <a href="https://dev.mysql.com/doc/refman/8.0/en/innodb-consistent-read.html" rel="nofollow noreferrer">https://dev.mysql.com/doc/refman/8.0/en/innodb-consistent-re...</a>

PeterZaitsevover 1 year ago

What would be helpful is not just comparison to theoretical definition to isolation modes but rather comparison to other popular relational databases - PostgreSQL, MS SQL, Oracle ? Something developers need to mind if they want to assure compatibility

评论 #38707803 未加载

rob-olmosover 1 year ago

Related: ACIDRain for the repeatable read without explicit "select .. for update" locking gotcha gift that still keeps on giving: <a href="https://news.ycombinator.com/item?id=20027532">https://news.ycombinator.com/item?id=20027532</a>

adontzover 1 year ago

Serious question. I have this question for, like 20 years already.Why would anyone start a new project with MySQL? Is it really superior in anything? I'm in industry for 20+ years and as far as I remember MySQL was always the worst and most popular RDBMS at any given moment.

评论 #38696744 未加载

评论 #38707232 未加载

评论 #38696315 未加载

评论 #38703532 未加载

评论 #38696282 未加载

评论 #38696068 未加载

评论 #38696630 未加载

评论 #38696205 未加载

评论 #38700830 未加载

评论 #38696346 未加载

评论 #38698858 未加载

评论 #38696760 未加载

评论 #38704097 未加载

评论 #38696833 未加载

评论 #38707268 未加载

评论 #38704065 未加载

评论 #38706644 未加载

评论 #38701260 未加载

16 comments

tzoneover 1 year ago

评论 #38705215 未加载

评论 #38707157 未加载

评论 #38704517 未加载

评论 #38704474 未加载

pellaover 1 year ago

评论 #38718725 未加载

amlutoover 1 year ago

评论 #38696828 未加载

评论 #38696799 未加载

Corradoover 1 year ago

评论 #38707747 未加载

评论 #38713364 未加载

PeterZaitsevover 1 year ago

Facinating read. I think it is a great illustration to show how many "practically working systems" can be built on the foundation exhibiting so many consistency artifacts

评论 #38704482 未加载

dasmoopover 1 year ago

The RDS replication that stopped working after 5min messing with it, with no alert of failed health check is a bit worrying...

评论 #38701394 未加载

评论 #38704537 未加载

camgunzover 1 year ago

baqover 1 year ago

how about that, I planned to do some work today.aphyr, thank you. call me maybe and later jepsen.io have been consistently some of the best content I've ever read on the internet.

评论 #38697078 未加载

hipadev23over 1 year ago

"SELECT ... FOR UPDATE" seems to be the answer to all these issues right? Lock the rows you're going to be updating and suddenly everything works as advertised.

评论 #38704431 未加载

评论 #38704291 未加载

klysmover 1 year ago

评论 #38696811 未加载

评论 #38696489 未加载

评论 #38703685 未加载

评论 #38697928 未加载

PeterCorlessover 1 year ago

How much of what is contained within this analysis of MySQL is going to be the same-same for MariaDB, given that it uses InnoDB as the default storage engine?