Why Uber Engineering Switched from Postgres to MySQL

731 pointsby myhrvoldalmost 9 years ago

53 comments

jedbergalmost 9 years ago

> MySQL supports multiple different replication modes:> Statement-based replication replicates logical SQL statements (e.g., it would literally replicate literal statements such as: UPDATE users SET birth_year=770 WHERE id = 4)Postgres has that too (using a 3rd party tool, but it's an officially supported tool). We were using it on reddit 10 years ago. It caused a lot of problems. I wouldn't call that an advantage for Mysql.Honestly, reading this it seems like the summary is: "We don't follow great engineering practices so we need a database more forgiving". Which is fine if that's how you want to run your business, but isn't really the death knell for Postgres.A specific example:> This problem might not be apparent to application developers writing code that obscures where transactions start and end. For instance, say a developer has some code that has to email a receipt to a user. Depending on how it’s written, the code may implicitly have a database transaction that’s held open until after the email finishes sending. While it’s always bad form to let your code hold open database transactions while performing unrelated blocking I/O, the reality is that most engineers are not database experts and may not always understand this problem, especially when using an ORM that obscures low-level details like open transactions.Your developer should understand database transactions. But you should make it easier for them by abstracting it so that they don't have to. And in this particular case, I'd say they shouldn't be using the database to do locking around sending a receipt. It should be put into a queue and that queue should be processed separately, which avoids the transaction problem altogether.

评论 #12169473 未加载

评论 #12167741 未加载

评论 #12168036 未加载

评论 #12168601 未加载

评论 #12169345 未加载

评论 #12167848 未加载

评论 #12167801 未加载

评论 #12169767 未加载

ledjonalmost 9 years ago

I would argue that most of these Postgres "flaws" are actually advantages over MySQL when you look at them holistically rather than the very specific Uber use-case.Postgres's MVCC is superior (can rollback DDL, can add indexes online, can have open read transactions for a VERY long time without impacting other parts of the system)Postgres supports many types of indexes, not just b-tree. One thing it doesn't have is clustered b-tree indexes... which is really what MySQL does that makes it somewhat "better." I wonder how Uber adds an index to a table that already has 1B+ rows in it with mysql?Postgres have WAL level replication is a better guarantee of actually replicating the data correctly. I cannot tell you how many times I've had to tell my boss that the "mysql replicas might be slightly out of sync with the master" because of various replication issues. The way it handles triggers and scheduled events alone is garbage and can very easily break replication and/or silently cause inconsistency.As for data corruption, if there is a bug that causes corruption, then there is a bug. I don't think that is a fundamental design flaw as implied in this article. You shouldn't rely on 1/2 assed replication design to accidentally save you from the data corruption bug. There are many downsides to the design MySQL has that are simply not listed here.I have been both a professional MySQL administrator as well as Postgresql (as well as SQL Server and many NoSQL engines). Many of these Postgres issues are only issues at crazy huge scale, and I would say at that point you probably want to move away from relational anyway. MySQL has its own very large set of problems at scale as well.It sounds like Uber is using MySQL as just a data bucket with primary keys ("Schemaless") which is good -- because you can't alter tables to save your life with MySQL.At the end of the data each developer/business needs to use what works for them, but I would really shy away from pointing to this article as a linchpin in the "MySQL vs. Postgres" war (if there even is such a thing.)

评论 #12167392 未加载

评论 #12167413 未加载

评论 #12167318 未加载

评论 #12167518 未加载

评论 #12168895 未加载

评论 #12169015 未加载

评论 #12181299 未加载

评论 #12167327 未加载

sam_pointeralmost 9 years ago

We did something very similar at EA Playfish, at least one alumni of which is part of the Uber engineering team.We used a 2 column InnoDB-backed table for all of our data storage, massively sharded, and run in a 3-host master-slave-slave configuration.At that time EC2 would routinely kill hosts without the courtesy of a poke via ACPI and as such we became very good at quickly recovering shards. In a nutshell this mechanism was to have the new host contact a backup slave, perform an lvm snap, pipe the compressed snap over a TCP connection, unroll it and carry on, letting replication take up the delta.That enabled us to not only manage the 10 million or so daily active users of that title, but was also the platform under the 12 or so additional titles that studio had.We had lots and lots of very simple things and failures were contained.I think at the time we were the 3rd-largest consumer of EC2 after Netflix and "another" outfit I never learned the name of. EA being what it was, however, we were never permitted to open source a lot of the cool stuff Netflix and ourselves seemed to develop in parallel.

评论 #12167291 未加载

fusiongyroalmost 9 years ago

The article could be summed up as "Postgres is not a distributed database." MySQL isn't either, although it certainly has more friendly replication technology. I think it's a lot more likely that what's really happening here is that they've designed their "schemaless" schema or its supporting software to handle the kind of soft errors that MySQL is permitting and Postgres was not.We have MySQL replication across the country where I work and I certainly wouldn't characterize it as robust; it fails every 3-6 months. MySQL replication is certainly a lot older and easier to use than Postgres's, but SQL databases are fundamentally CP systems. When you say "This design means that replicas can routinely lag seconds behind master, and therefore it is easy to write code that results in killed transactions" it sounds like you're blaming the way replication was implemented for a physical problem. There is no way to design a replication system such that two highly-consistent databases can achieve perfect availability in the face of real-world networks. A worse protocol can exacerbate the problem, but a better one can't make it go away.I have never seen corruption with Postgres (unlike MySQL), but I have never tried cross-datacenter replication with it. Apart from that, Postgres generally seems to do much better with consistency than MySQL does, where DDL statements are not transactional, etc. So I am not surprised to hear that their system trips harder on Postgres's more aggressive consistency.In short, I suspect a more robust solution to their problem is a NoSQL database. On the other hand, it sounds like they want a combination of availability and consistency that will be difficult to get off-the-shelf. I'm glad they found a way to make it work. I wouldn't generally choose Postgres for a scalable system with an aggressive availability constraint--but then again, I wouldn't choose MySQL either, and I generally avoid problems that demand highly scalable, highly available solutions.

drobalmost 9 years ago

We've hit a lot of the same fundamental limits scaling PostgreSQL at Heap. Ultimately, I think a lot of the cases cited here in which PostgreSQL is "slower" are actually cases in which it does the Right Thing to protect your data and MySQL takes a shortcut.Our solution has been to build a distribution layer that makes our product performant at scale, rather than sacrificing data quality. We use CitusDB for the reads and an in-house system for the writes and distributed systems operations. We have never had a problem with data corruption in PostgreSQL, aside from one or two cases early on in which we made operational mistakes.With proper tuning and some amount of durability-via-replication, we've been able to get great results, and that's supporting ad hoc analytical reads. (For example, you can blunt a lot of the WAL headaches listed here with asynchronous commit.)

pellaalmost 9 years ago

Roadmaps ( PostgreSQL ) 2016-2017-...* Postgres Professional roadmap ( Pluggable storages, Multimaster cluster with sharding, Effective partitioning, Adaptive query planning, Page-level data compression, Connection pooling, Native querying for jsonb with indexing support, ....) <a href="https://wiki.postgresql.org/wiki/Postgres_Professional_roadmap" rel="nofollow">https://wiki.postgresql.org/wiki/Postgres_Professional_roadm...</a>* EnterpriseDB database server roadmap ( Parallelism, Replication, Vertical Scalability, Performance ) <a href="https://wiki.postgresql.org/wiki/EnterpriseDB_database_server_roadmap" rel="nofollow">https://wiki.postgresql.org/wiki/EnterpriseDB_database_serve...</a>====And "Scalable PostgreSQL for real-time workloads <a href="https://www.citusdata.com" rel="nofollow">https://www.citusdata.com</a> " --> <a href="https://github.com/citusdata/citus" rel="nofollow">https://github.com/citusdata/citus</a>

评论 #12168180 未加载

评论 #12170408 未加载

thinkingkongalmost 9 years ago

Posts like this are important.We too often rely on a buzz-word heuristic and that's how you end up with dozens of random technologies that are harder to maintain and don't necessarily solve any of your problems. This method is good, because it shows that when you understand the problem the right way, you can find the right solution, even if by popularity it looks like a "step backwards"Massive Kudos.

评论 #12167316 未加载

评论 #12167772 未加载

评论 #12167012 未加载

NhanHalmost 9 years ago

Well, this is heresy. Does that mean we are now officially boycotting Uber?Joke asides, one thing I've been trying to figure out for awhile is the limitation at which certain components/ systems broke down. Basically, something along the line of "given X records, this operations would take Y time, or would cause Z A B C problems". I've actually got developers friends asking me how fast a simple "SELECT * FROM X WHERE index=?" would take on a million row table, since they were surprised that some NoSQL DB could do a query on hundred million rows in a few seconds.I guess that's part of why you only learned how to scale after having done it once.

评论 #12167361 未加载

forgotpwtomainalmost 9 years ago

Great write up. A couple points -I'm not sure this post is illustrative of any generally applicable considerations (re: the title) in the choice of Postgresql vs MySQL, since Uber seems to no longer be using a relational model for most of their data and is using MySQL effectively as a key-value store.> say a developer has some code that has to email a receipt to a user. Depending on how it’s written, the code may implicitly have a database transaction that’s held open until after the email finishes sending. While it’s always bad form to let your code hold open database transactions while performing unrelated blocking I/O, the reality is that most engineers are not database experts and may not always understand this problem, especially when using an ORM that obscures low-level details like open transactions.I have to very seriously disagree here, ORMs make a lot of things easy - and you can get away with building stuff for a while without understanding the underlying databases or SQL but only to a certain scale (I'd say more like medium-scale, definitely not large or Uber level). If you have engineers writing code that interacts with a database without understanding transactional semantics, the engineer in question not the database is the problem.> We started out with Postgres 9.1 and successfully completed the upgrade process to move to Postgres 9.2. However, the process took so many hours that we couldn’t afford to do the process again.There seem to be ways [0][1] to do online upgrades with Postgres (before logical decoding in 9.4), although I haven't personally used them. Not sure if they explored these options at Uber or not?[0] <a href="https://github.com/markokr/skytools" rel="nofollow">https://github.com/markokr/skytools</a> [1] <a href="http://slony.info/" rel="nofollow">http://slony.info/</a>

评论 #12167681 未加载

BadassFractalalmost 9 years ago

I've heard from technical leaders at multiple now well established unicorns how they'd never use postgres or switched from postgres simply because MySQL has a lot more tooling built it and many more people are exposed to its shortcomings at "web scale" so that it's very well known where and when things will break.Disclaimer, I'm a hardcore Postgres user myself, but I also keep tabs on the other tools.

评论 #12167025 未加载

评论 #12166967 未加载

quotemstralmost 9 years ago

> Each of these system calls incurs a context switchSystem calls are not context switches. I wish people would distinguish between them. A system call is just a change of privilege level and is efficient --- there's no cache invalidation required on almost any system.A context switch, on the other hand, involves a call to the scheduler, saving and restoring of much more CPU register state, and various kinds of cache invalidation. (It's even more expensive if you're switching between different processes instead of different threads in the same process.)The kernel may perform a context switch while executing a system call --- this context switch is what makes blocking calls blocking. But even IO system calls do not necessarily cause context switches, especially in the case where an operation can be satisfied by accessing only the page cache.tl;dr A system call is not necessarily a context switch

Illniyaralmost 9 years ago

So the major issue detailed here is that postgres basically uses immutables rows which creates performance issues with writes.Just read about their new schemaless db in their blog an the first paragraph contains this:"The basic entity of data is called a cell. It is immutable, and once written, it cannot be overwritten. (In special cases, we can delete old records.) A cell is referenced by a row key, column name, and ref key. A cell’s contents are updated by writing a new version with a higher ref key but same row key and column name."So, mmm..., not saying that postgres didn't pose a problem for them but I think postgres' db model fits better to their new db then mysql. They probably had to work really hard to get mysql to work like postgres.Without this issue, it looks like two things needed.to be done with postgres that would have solved their problems have indexes that point to primary id and do logical replication (which they say a plugin solved in 9.4).Is this a case of "I got burned by something so I won't use it again"

评论 #12167555 未加载

0xmohitalmost 9 years ago

Facebook maintains it's own fork [0] of MySQL. A couple of interesting talks are also available: MySQL at Facebook, Current and Future [1] and Massively Distributed Backup at Facebook Scale [2].[0] <a href="https://github.com/facebook/mysql-5.6" rel="nofollow">https://github.com/facebook/mysql-5.6</a>[1] <a href="https://www.youtube.com/watch?v=jqwegP9xwVE" rel="nofollow">https://www.youtube.com/watch?v=jqwegP9xwVE</a>[2] <a href="https://www.youtube.com/watch?v=UBHcmP2TSvk" rel="nofollow">https://www.youtube.com/watch?v=UBHcmP2TSvk</a>

评论 #12166893 未加载

评论 #12167009 未加载

评论 #12167862 未加载

评论 #12168885 未加载

评论 #12169278 未加载

pritambaralalmost 9 years ago

I wonder what the design decisions are behind (or what it would take) to make Postgres store secondary indexes on disk like InnoDB does. Sure, the extra index lookup through the primary index is a cost, but it seems like write-amplification can sure be a greater concern too. Ultimately, it would be nice if Postgres gave the DBA a choice of — if not move outright to — secondary-index indirection through the primary index like InnoDB does.

评论 #12167161 未加载

denishpatelalmost 9 years ago

They migrated from MySQL to Postgres (<a href="https://www.yumpu.com/en/document/view/53683323/migrating-uber-from-mysql-to-postgresql" rel="nofollow">https://www.yumpu.com/en/document/view/53683323/migrating-ub...</a>) with almost same reasons and now they are counter argumenting for their inability to use software and lack of skills to upgrade!

scotty79almost 9 years ago

Funny how the article is just:- we used X in a fashion that suited us best- it caused us problems Y because of some technicalities of X- so we switched to Z and we could avoid Y thanks to how Z handles the technicalities differently than Yand the top rated HN comments are:- you used the X wrong- all the technicalities of X that caused you problems Y are actually superior features of X

评论 #12171602 未加载

appleflaxenalmost 9 years ago

This was a great overview and write-up.Anyone know why they are using MySQL over MariaDB[1]?1. <a href="https://mariadb.org/" rel="nofollow">https://mariadb.org/</a>

评论 #12167215 未加载

评论 #12166919 未加载

评论 #12167837 未加载

jswnyalmost 9 years ago

It really is sad that we all just can't appreciate a company like Uber giving us insight into their internal engineering choices. So many people on here think Postgres is so perfect that if you don't like it you must be using it wrong. Postgres is a tool like anything else. It has good use cases, and bad use cases.

pellaalmost 9 years ago

anno 2013 "MIGRATING UBER FROM MYSQL TO POSTGRESQL"<a href="https://www.yumpu.com/en/document/view/53683323/migrating-uber-from-mysql-to-postgresql" rel="nofollow">https://www.yumpu.com/en/document/view/53683323/migrating-ub...</a>

markpapadakisalmost 9 years ago

Great write-up. A few observations:1. The encoding and translation schemes of Postgres and mySQL/InnoDB are well described in the blog post, and I would also agree that InnoDB’s design is, all things considered, better for all the reasons outlined in the post.2. I don’t understand why anyone still uses lseek() followed by read()/write() and not pread()/pwrite() syscalls. It’s trivial to replace the pair of calls with one. Aerospike is another datastore that resorts to pairs of seek/red-write instead of pread/pwrite calls.3. Process/connection model makes no real sense nowadays - although to be fair, there is, today, practically almost no difference in terms of footprint between OS threads and OS processes (other than memory and FDs sharing semantics, they are practically the same). It’s still more appropriate to use threads (although I ‘d argue maintaining a pool of threads for processing requests and one/few threads for multiplexing network I/O is the better choice).4. ALTER TABLE is obviously a pain point with mySQL, although I am not really sure many users with large datasets care; they probably figured out long ago it’s going to be an issue and they designed and expanded accordingly. It’s also a relatively rare operation. That said, other than using mySQL (or any other RDBMS) to build the data plane for an elaborate, distributed KV store, one should consider Salesforce’s approach too. Their tables have some 50 or so columns, and the column names are generic (e.g column_0, column_1, … ). They have a registry where they assign column indices (e.g column_0) to a specific high-level entity type (e.g customer title, or price), and whenever they need to query, they just translate from the high level entity to the actual column names and it works. They also, IIRC, use other tables to index those columns (e.g such an index table can have just 3 columns, table id, column index, value) and they consult that index when needed (FriendFeed did something similar).5. Cassandra should have no problem supporting the operations and semantics of Shemaless ass described in their blog posts. However, given they already operate it in production, they probably considered it and decided against it.

评论 #12169159 未加载

0xmohitalmost 9 years ago

Worth quoting from the article:<pre><code> Accordingly, using pgbouncer to do connection pooling with Postgres has been generally successful for us. However, we have had occasional application bugs in our backend services that caused them to open more active connections (usually “idle in transaction” connections) than the services ought to be using, and these bugs have caused extended downtimes for us.</code></pre>

评论 #12167416 未加载

vanviegenalmost 9 years ago

One major advantage of MySQL's clustered indexes the article doesn't mention is that, although secondary key reads may be a little slower, primary key reads will be faster. The row data lives in the primary key index, so there is no need for referencing an additional database page (possibly causing random I/O).This is especially relevant when doing range queries over the primary key. Imagine a table containing billions of chat messages, from which you want to retrieve a single conversation history. With a clustered primary key on (conversation id, message id), MySQL would need to process just a couple of database pages. Postgres, on the other hand, would need to reference a semi-random page for each of the messages.Now imagine a 10k message chat conversation, a table too large to fit into RAM, and storage by means of spinning rust (yeah, yeah, I know what year it is :-)). The difference would be somewhere between 2 and 3 orders of magnitude.

评论 #12170903 未加载

jasodealmost 9 years ago

Fyi... a related (not duplicate) discussion of a previous Uber story: <a href="https://news.ycombinator.com/item?id=10923848" rel="nofollow">https://news.ycombinator.com/item?id=10923848</a>

trequartistaalmost 9 years ago

Wow, this is such a detailed analysis. Having used Postgres and suffered issues with data replication as well as database crashes, this post was really helpful.

ismdubeyalmost 9 years ago

The fact that Uber scaled to so many users with Postgress gives me such a relief. For now, I am good !!

cdelsolaralmost 9 years ago

So basically, if you don't intend to use it as a relational database, and you have enough scale to run cross-data-center (and across-the-world) master-master replication, then you should maybe switch from PostgreSQL to MySQL?

vbezhenaralmost 9 years ago

Why would anyone run hundreds of connections? A server can only process number_of_processor_cores connections at once. Sure, few connections might wait for I/O, but not hundreds, unless database is very untypical.

评论 #12167224 未加载

评论 #12167570 未加载

评论 #12167262 未加载

评论 #12167324 未加载

viraptoralmost 9 years ago

I'd like to see their migration strategy as well. I mean, they say moving from pgsql 9.2 to higher version (which then allows online upgrades) is too much work. Yet they'll have to migrate to mysql, which will take much more engineering effort. For anything close to realtime, they'll need to copy the old data, while at the same time forking the new writes into both pgsql slaves and new mysql servers. And they cannot use WAL for that without some advanced processing.I hope this follows in the next blog post.

sriharisalmost 9 years ago

A common solution to conserve bandwidth is to use compression. This can be done easily in PostgreSQL by using ssh tunnels and turning on compression. I wonder why they didn't try that.

评论 #12171956 未加载

branduralmost 9 years ago

Interesting post! While I suspect that a MySQL installation is just as likely to have its own problems in the long run, I'm not smart enough to provide any kind of compelling point-by-point refutation. However, a number of the points made strike me as having possible trade-offs that were not really addressed in-depth.My summary of the arguments against Postgres and some basic thoughts on each:1. Writes are more expensive because all secondary indexes must be updated with a new physical location.This may be true, but the MySQL model of using primary keys from secondary indexes will mean that reads are inherently expensive. They even mention this:> This design means that InnoDB is at a slight disadvantage to Postgres when doing a secondary key lookup, since two indexes must be searched with InnoDB compared to just one for Postgres.So it seems like a classic read vs. write trade-off.I'm also a little skeptical of any performance claims that don't include any numbers. It's possible that efficient coding in Postgres makes this much more of a wash in terms of performance than claimed here.2. Replication is less efficient because it's sending a lot of physical information out along the stream.This is quite true, but IMO unlikely to be a major issues for most users unless they're dealing with a huge amount of data and streaming it over a slow connection (i.e. across the continent like Uber's disaster recovery center).3. Data corruption from a bug found in 9.2.Certainly a bad situation, but IMO not really a valid claim for situation. 9.2 is way behind at this point, and there's not much to say that they wouldn't have encountered a similar bug or something worse in MySQL in all that time, especially operating at scale.To give a counter-anecdote, I operated Postgres at scale for a long time across many versions starting at 9.1 and was lucky enough to have never once encountered a bug with data corruption.4. Postgres' MVCC model makes it easy for replicas to accidentally fall behind their master.This one is valid (and annoying), but there are very good reasons for it, and you have some switches to control the behavior based on value transactions finishing on followers or prompt replication more highly.5. Upgrades are difficult because the WAL stream works at a physical level and is not compatible between database versions.Again, this is valid, but the statement-based replication is a scary idea. Row-level replication is more interesting and probably something that Postgres should have though.Some good news is that Postgres is getting closer to logical WAL streaming, which should make in-place upgrades possible.

评论 #12167501 未加载

polskibusalmost 9 years ago

Does anyone know if citusdb or enterprisedb improve on the postgresql issues mentioned in the post vs last postgresql version?

评论 #12167301 未加载

niermanalmost 9 years ago

with respect to "Difficulty upgrading to newer releases":pg_upgade has a --link option which uses hard links in the new cluster to reference files from the old cluster. This can be a very fast way to do upgrades even for large databases (most of the data between major versions will look the same; perhaps only some mucking with system catalogs is required in the new cluster). Furthermore, you can use rsync with --hard-links to very quickly upgrade your standby instances (creating hard links on the remote server rather than transferring the full data).that is all referenced in the current documentation: <a href="https://www.postgresql.org/docs/current/static/pgupgrade.html" rel="nofollow">https://www.postgresql.org/docs/current/static/pgupgrade.htm...</a>

macealmost 9 years ago

I think a fair summary is "We had a few problems with PostgreSQL mostly due to our rapid growth. We rethought the problem and changed the way we use a relational database for large-scale storage and are now using MySQL as a dumb key-value store."The conclusion reinforces this: "Postgres served us well in the early days of Uber, but we ran into significant problems scaling Postgres with our growth."I read this a both endorsement of PostgreSQL as well as highlighting some of the problems that any large-scale use of it would run into.

snarfyalmost 9 years ago

I wonder how much of this could have been solved by using a different file system. There is all of this talk about the physical layer but no mention of the file system used.> Typically, write amplification refers to a problem with writing data to SSD disks: a small logical update (say, writing a few bytes) becomes a much larger, costlier update when translated to the physical layer.This is exactly the type of problem solved by the file system layer.

评论 #12167822 未加载

mspradleyalmost 9 years ago

Why did they not consider Oracle or MS SQL Server? They can afford the licensing and both have numerous replication technologies to choose from.

评论 #12168069 未加载

manigandhamalmost 9 years ago

They should use SQL Server (which has great replication abilities, although horizontal scale out is still difficult) or MemSQL (which is distributed, scalable, and can do everything they need).Or use Cassandra which is a perfect fit (or ScyllaDB which is a better version of it).This all sounds like an aversion to just paying for or using better products when the problem is easily solved.

评论 #12168014 未加载

exabrialalmost 9 years ago

My big think with MySQL is that the last time I needed to make this decision (about 3yr ago), the support tooling, community knowledge, and documentation around MySQL was light years ahead of Postgres. There were literally hundreds of MySQL clients and utilities and Postgres was "just a database".Competition breeds excellence.

simon2Qalmost 9 years ago

I've tried to produce a full reply to most of the technical points raised there. <a href="http://blog.2ndquadrant.com/thoughts-on-ubers-list-of-postgres-limitations/" rel="nofollow">http://blog.2ndquadrant.com/thoughts-on-ubers-list-of-postgr...</a>

_navaneethanalmost 9 years ago

From this article, Can i assume the below point?why postgres is designed such a way of "physical replication" rather than the design of mysql's ONLY "logical replication"?Because postgres empowering data integrity with help of forceful constraints.

hyperion2010almost 9 years ago

This is a fantastic read. I hope the pg folks can turn as many of the issues brought up here into bug reports as possible (I think many of the issues, especially re: replication, are known), this kind of feedback is invaluable.

评论 #12167147 未加载

raisyeralmost 9 years ago

while Postgress might be better if you use it 'as-is'.... community of MySQL is much better and the tools available are more mature... just goes on to prove that even if something is not-that-good.. it still might be successful,scalable and popular if there is a strong community behind it..

评论 #12167461 未加载

distantsoundsalmost 9 years ago

Perhaps their engineers can design a web page that allows the scroll wheel to work.

madhusudhan000almost 9 years ago

So let me try to summarise this.Poor replica MVCC supportThey are actually pointing to a blog article written in 2010 -> <a href="http://blog.2ndquadrant.com/tradeoffs_in_hot_standby_deplo/" rel="nofollow">http://blog.2ndquadrant.com/tradeoffs_in_hot_standby_deplo/</a>Do they realise that it is 2016 ?Guess they did't bother to understand the hot standby feedback system.> Postgres’s design resulted in inefficiencies and difficulties for our data at Uber.What kind of inefficiency ? The explain what is purpose of WAL and replication which every database person knows about but didn't care to explain the actual problem at hand ?Data corruption> During a routine master database promotion to increase database capacity, we ran into a Postgres 9.2 bugWhy the heck didn't they upgrade to a newer version ? Did you report this bug to pg dev , did they take so much time to fix this, or were you just assuming that the bug could fix itself ?> The bug we ran into only affected certain releases of Postgres 9.2 and has been fixed for a long time now. However, we still find it worrisome that this class of bug can happen at all.Postgres 9.2 is pretty old and there has been 3 major releases after that. WTF ?I can say countless instances where MySQL data corruption was a constant nuisance with version 5.5 and they have fixed it with newer releases.Replication> During peak traffic early on, our bandwidth to the storage web service simply wasn’t fast enough to keep up with the rate at which WALs were being written to itSo you have run into a hardware limitation and then blame postgres. What was limit that you hit ? I don't understand this point at all.Concept of context switchingI am surprised that this is actually an issue, in a database the slowest part is always the disk and not the CPU. Confused on how did they hit this limitation first without actually touching others.Time taken by a context switch : <a href="http://stackoverflow.com/questions/21887797/what-is-the-overhead-of-a-context-switch" rel="nofollow">http://stackoverflow.com/questions/21887797/what-is-the-over...</a>Which is in microseconds.InnoDB buffer pool> By comparison, the InnoDB storage engine implements its own LRU in something it calls the InnoDB buffer poolPostgres has something similar called shared_buffer. They are speaking as if postgres relies entirely on the operating system which is false.> It makes it possible to implement a custom LRU design. For instance, it’s possible to detect pathological access patterns that would blow out the LRU and prevent them from doing too much damageNot sure what kind of damage they are speaking. In a postgres sequential scan (full table scan), a ring buffer is used instead and does not result in the shared buffers being blown away.If you need a custom LRU design, there is definitely something wrong in the way that you are using an OLTP database.Connection HandlingThis is complete BS. Nobody uses databases without connection pools. Agree that a thread is more lightweight than a process, but you would never hit this limit at all in real time which is in the order of microseconds again. In a production system, one would open connections immediately and then hold them in the connection pool. This overhead is almost not visible at all. If you are constantly opening and closing connections then there is something seriously wrong with your design.> However, we have had occasional application bugs in our backend services that caused them to open more active connections (usually “idle in transaction” connections) than the services ought to be using, and these bugs have caused extended downtimes for usSo they are blaming the database for a bug in their design/system. Computers are no match for human stupidity.> Accordingly, using pgbouncer to do connection pooling with Postgres has been generally successful for us.Again what is the problem, the whole article smells more and more like a useless rant, just because you dont know how to use them ?ConclusionAnother thing is that they have not given any kind of query/access pattern in which they use postgres/mysql. They put in a couple of low level things and then say that postgres is badly designed.I can think of only two logical explanations1) The article writer was already familiar with MySQL and they didn't bother to even dig into postgres deeper2) They have been paid by oracle :P

评论 #12176567 未加载

postilaalmost 9 years ago

Well done :-) <a href="https://twitter.com/LeviNotik/status/757991465649778689" rel="nofollow">https://twitter.com/LeviNotik/status/757991465649778689</a>

xaprbalmost 9 years ago

Why not PostgreSQL? (Sorry, someone had to say it.)

zeeshanmalmost 9 years ago

I like the sample data they have used:<pre><code> id first last birth_year 1 Blaise Pascal 1623 2 Gottfried Leibniz 1646 3 Emmy Noether 1882 4 Muhammad al-Khwārizmī 780 5 Alan Turing 1912 6 Srinivasa Ramanujan 1887 7 Ada Lovelace 1815 8 Henri Poincaré 1854</code></pre>

评论 #12167260 未加载

评论 #12167029 未加载

评论 #12167486 未加载

dschiptsovalmost 9 years ago

Nice to see how Postgres (a relative of Informix) follows an old-school maxim to focus on Consistency and Durability by being "append-only" and never over-writing the data. Sticking to the right principles is better than over-optimization.The Uber engineers should, perhaps, take a look at Changelogs of last 5 or so releases of MySQL to see how many bugs in InnoDB has been found in each release and read stories about data loses due to inability to repair storages.According to old-school DBA tradition, it is much better to have an straightforward storage engine based on right principles written in C than fancy storage written in C++. At least if one values ones data.Well, in the age of in-memory "databases", "durability through replication" and "eventual consistency" old school focus on disk commits might sound a bit outdated, until one gets that moment when shards got messed up and there is no way to know how many writes are missing and where.Database is a durable storage which guarantees data consistency and ability to roll-back to a clean state through direct-access (by passing all caches) writes. At lest this is what we had in glorious times of IDS 7.3

frikalmost 9 years ago

Also check out highscalability.com for more stories that value MySQL and its great InnoDB engine: <a href="http://highscalability.com/blog/category/mysql" rel="nofollow">http://highscalability.com/blog/category/mysql</a>

morekozhambualmost 9 years ago

There is this interesting talk on MySQL vs Postgres.<a href="https://www.youtube.com/watch?v=emgJtr9tIME" rel="nofollow">https://www.youtube.com/watch?v=emgJtr9tIME</a>

cnfjdnxalmost 9 years ago

Uber """"""""""engineering""""""""""

kev009almost 9 years ago

I like this transparency, I know I never want to work at uber.

slantedviewalmost 9 years ago

The connection handling section was surprising to me, reading that Postgres uses a process per connection! This is pretty shocking to me, in a bad way.

评论 #12167006 未加载

评论 #12166961 未加载

评论 #12167221 未加载

评论 #12167314 未加载

swasheckalmost 9 years ago

this reads like a laundry list of buzzwords that were designed to justify not throwing any effort into postgresql and just going with a new shiny toy (not mysql. yes. i know it's been around for a while).it happens everywhere.

评论 #12166839 未加载

评论 #12166843 未加载

评论 #12166851 未加载

评论 #12166980 未加载

评论 #12167146 未加载

评论 #12167125 未加载

53 comments

jedbergalmost 9 years ago

评论 #12169473 未加载

评论 #12167741 未加载

评论 #12168036 未加载

评论 #12168601 未加载

评论 #12169345 未加载

评论 #12167848 未加载

评论 #12167801 未加载

评论 #12169767 未加载

ledjonalmost 9 years ago

评论 #12167392 未加载

评论 #12167413 未加载

评论 #12167318 未加载

评论 #12167518 未加载

评论 #12168895 未加载

评论 #12169015 未加载

评论 #12181299 未加载

评论 #12167327 未加载

sam_pointeralmost 9 years ago

评论 #12167291 未加载

fusiongyroalmost 9 years ago

drobalmost 9 years ago

pellaalmost 9 years ago

评论 #12168180 未加载

评论 #12170408 未加载

thinkingkongalmost 9 years ago

评论 #12167316 未加载

评论 #12167772 未加载

评论 #12167012 未加载

NhanHalmost 9 years ago

评论 #12167361 未加载

forgotpwtomainalmost 9 years ago

评论 #12167681 未加载

BadassFractalalmost 9 years ago

评论 #12167025 未加载

评论 #12166967 未加载

quotemstralmost 9 years ago

Illniyaralmost 9 years ago

评论 #12167555 未加载

0xmohitalmost 9 years ago

评论 #12166893 未加载

评论 #12167009 未加载

评论 #12167862 未加载

评论 #12168885 未加载

评论 #12169278 未加载

pritambaralalmost 9 years ago

评论 #12167161 未加载

denishpatelalmost 9 years ago

scotty79almost 9 years ago

评论 #12171602 未加载

appleflaxenalmost 9 years ago

This was a great overview and write-up.Anyone know why they are using MySQL over MariaDB[1]?1. <a href="https://mariadb.org/" rel="nofollow">https://mariadb.org/</a>

评论 #12167215 未加载

评论 #12166919 未加载

评论 #12167837 未加载

jswnyalmost 9 years ago

pellaalmost 9 years ago

markpapadakisalmost 9 years ago

评论 #12169159 未加载

0xmohitalmost 9 years ago

评论 #12167416 未加载

vanviegenalmost 9 years ago

评论 #12170903 未加载

jasodealmost 9 years ago

Fyi... a related (not duplicate) discussion of a previous Uber story: <a href="https://news.ycombinator.com/item?id=10923848" rel="nofollow">https://news.ycombinator.com/item?id=10923848</a>

trequartistaalmost 9 years ago

Wow, this is such a detailed analysis. Having used Postgres and suffered issues with data replication as well as database crashes, this post was really helpful.

ismdubeyalmost 9 years ago

The fact that Uber scaled to so many users with Postgress gives me such a relief. For now, I am good !!

cdelsolaralmost 9 years ago

vbezhenaralmost 9 years ago

评论 #12167224 未加载

评论 #12167570 未加载

评论 #12167262 未加载

评论 #12167324 未加载

viraptoralmost 9 years ago

sriharisalmost 9 years ago

A common solution to conserve bandwidth is to use compression. This can be done easily in PostgreSQL by using ssh tunnels and turning on compression. I wonder why they didn't try that.

评论 #12171956 未加载

branduralmost 9 years ago

评论 #12167501 未加载

polskibusalmost 9 years ago

Does anyone know if citusdb or enterprisedb improve on the postgresql issues mentioned in the post vs last postgresql version?

评论 #12167301 未加载

niermanalmost 9 years ago

macealmost 9 years ago

snarfyalmost 9 years ago

评论 #12167822 未加载

mspradleyalmost 9 years ago

Why did they not consider Oracle or MS SQL Server? They can afford the licensing and both have numerous replication technologies to choose from.

评论 #12168069 未加载

manigandhamalmost 9 years ago

评论 #12168014 未加载

exabrialalmost 9 years ago

simon2Qalmost 9 years ago

_navaneethanalmost 9 years ago

hyperion2010almost 9 years ago

评论 #12167147 未加载

raisyeralmost 9 years ago

评论 #12167461 未加载

distantsoundsalmost 9 years ago

Perhaps their engineers can design a web page that allows the scroll wheel to work.

madhusudhan000almost 9 years ago

评论 #12176567 未加载

postilaalmost 9 years ago

Well done :-) <a href="https://twitter.com/LeviNotik/status/757991465649778689" rel="nofollow">https://twitter.com/LeviNotik/status/757991465649778689</a>

xaprbalmost 9 years ago

Why not PostgreSQL? (Sorry, someone had to say it.)

zeeshanmalmost 9 years ago

评论 #12167260 未加载

评论 #12167029 未加载

评论 #12167486 未加载

dschiptsovalmost 9 years ago

frikalmost 9 years ago

morekozhambualmost 9 years ago

There is this interesting talk on MySQL vs Postgres.<a href="https://www.youtube.com/watch?v=emgJtr9tIME" rel="nofollow">https://www.youtube.com/watch?v=emgJtr9tIME</a>

cnfjdnxalmost 9 years ago

Uber """"""""""engineering""""""""""

kev009almost 9 years ago

I like this transparency, I know I never want to work at uber.

slantedviewalmost 9 years ago

The connection handling section was surprising to me, reading that Postgres uses a process per connection! This is pretty shocking to me, in a bad way.