Showdown: MySQL 8 vs. PostgreSQL 10

101 点作者 kenn大约 7 年前

13 条评论

asah大约 7 年前

Great to see MySQL adding this stuff! There's still a ton of reasons to choose Postgres and more and more silicon valley startups seem to be choosing pg - I don't remember the last time I met a startup choosing MySQL.DBMSs are giant complex pieces of software with a million features - it's really hard to compare them. But if I had to sum it up, you can dump freaking line noise into Postgres and then hide the nastiness and manage and query it like a well-designed database. If you have a pretty database, good for you, but today's app writers have gotten lazy with NoSQL record stores and their databases resemble vomitoriums - and let's not talk about what people used to do in the 80s and early 90s.Without further ado, Postgres vomitorium cleanup features:- user defined functions/aggegrates/windowfuncs in your favorite language incl JavaScript, which means you can write tricky business logic once and run it where the data is, vs pulling out millions of records from the database. Language list: <a href="https://www.postgresql.org/docs/10/static/external-pl.html" rel="nofollow">https://www.postgresql.org/docs/10/static/external-pl.html</a>- foreign data wrappers with hundreds of connectors AND a 5 minute toolkit for authoring new wrappers in python and other scripting languages. <a href="https://wiki.postgresql.org/wiki/Foreign_data_wrappers" rel="nofollow">https://wiki.postgresql.org/wiki/Foreign_data_wrappers</a> <a href="http://multicorn.org/" rel="nofollow">http://multicorn.org/</a>- index goddamned anything. Postgres has the most array of index types of any open source database AND if you need, you can easily write a function (in javascript or python!!!) and create an index that's the result of that function call. Postgres even has a full range of partial indices and block range indices, which make it practical to index massive and sparse datasets. <a href="https://www.postgresql.org/docs/current/static/indexes-partial.html" rel="nofollow">https://www.postgresql.org/docs/current/static/indexes-parti...</a> <a href="https://www.postgresql.org/docs/current/static/indexes-expressional.html" rel="nofollow">https://www.postgresql.org/docs/current/static/indexes-expre...</a> <a href="https://www.postgresql.org/docs/current/static/brin-intro.html" rel="nofollow">https://www.postgresql.org/docs/current/static/brin-intro.ht...</a>EXAMPLE: I was once handed a MySQL database of IoT signals where timestamps were in seconds since the epoch and asked to report on this data without changing the database. No biggie: 15 minutes to replicate this OLTP database to a read-only Postgres replica (dumb schema mapping) - then a Postgres VIEW to hide this nastiness and a function index on the timestamp column (to_timestamp).(from memory) CREATE VIEW foo AS SELECT *, to_timestamp(mysql_ts_col) as ts_timestamp FROM replicated_mysql_table; CREATE INDEX foo_ts_inx ON replicated_mysql_table(to_timestamp(mysql_ts_col));At another company, we got dumped a load of JSON and weren't sure how we'd need to parse it. No biggie, I just created indices using function calls that parsed the JSON.- tons of native datatypes and extensible datatypes. In cases where you're handed complex structures or "weird" data that doesn't behave like most programmers expect, you can define new datatypes, then create a library of user defined functions around them. <a href="https://www.postgresql.org/docs/9.5/static/xtypes.html" rel="nofollow">https://www.postgresql.org/docs/9.5/static/xtypes.html</a>- sampling. Postgres has native, low-level support for queries that sample the data, which makes it super fast to explore data while preserving some semblance of statistics. <a href="https://blog.2ndquadrant.com/tablesample-in-postgresql-9-5-2/" rel="nofollow">https://blog.2ndquadrant.com/tablesample-in-postgresql-9-5-2...</a>- EXPLAIN. The Postgres planner/optimizer is still the king at explaining why your query is taking forever and what you can do about it. Admittedly, this stuff quickly gets arcane, but you can post your EXPLAIN output to a forum and guys like me will tell you how to override the JOIN order, update statistics, etc. <a href="https://www.postgresql.org/docs/current/static/using-explain.html" rel="nofollow">https://www.postgresql.org/docs/current/static/using-explain...</a>Finally, no conversation about Postgres vs <x> is complete without mentioning that ALMOST EVERY FEATURE IN POSTGRES WORKS WITH EVERY OTHER FEATURE, which means you don't waste hours investing in something and then "oops" your carefully written user defined function can't be invoked in some obscure place - with Postgres, you can assume everything just works and will keep working. There's few pieces of software that can claim this.

评论 #17136202 未加载

评论 #17140563 未加载

garyclarke27大约 7 年前

Interesting but obviously biased in favour of MySQL. I disagree with some Postgres (which I know v well not MySQL though) comments. Wrong everyone does not use Sequential Integer Primary Keys, because they are completely useless as a Set constraint, I use real world logic. Update is irrelevant to me, so Vacuum not a problem, I use append only ie immutable database, the upside (not mentioned) is that this architecture makes PostgreSQL rock solid, can stay up for years unlike for example MS SQL Server that requires regular reboots and often fails to come backup cleanly. Also provide DDL transactions with rollback. I’m curious does MySQL now have Check Constraints, Arrays, Drop Schema with cascade (a huge timesaver), V8 PL, Lateral Joins, Range Types, Custom Aggregates (so powerful) SQL Function inlining, Parallel Query, Functional Indexes, Exclusion Constraints (eg no period overlap), Notify/Listen, Foreign Data Wrappers?

mbell大约 7 年前

The connection comments are a bit dubious. MySQL will use less memory for 1000 connections but performance will still drop due to contention and context switching. In both systems you want a small number of connections to the actual database, something on the order of 1-2x cpu cores usually, and something on top pooling client connections if you need a lot of them, pgbouncer or the equivalent for MySQL.

评论 #17136108 未加载

foxylion大约 7 年前

The article contains a footnote about UUIDs as primary keys.> UUID as a primary key is a terrible idea, by the way — cryptographic randomness is utterly designed to kill locality of reference, hence the performance penaltyIs there anyone who can go a little bit more in detail?We planned to migrate our database to use UUIDs as primary keys. This will allow creating new rows on clients knowing the new primary key before sending them to the server (simplifying client and server code).

评论 #17135616 未加载

评论 #17135847 未加载

评论 #17135613 未加载

评论 #17135684 未加载

评论 #17135629 未加载

评论 #17135726 未加载

评论 #17136029 未加载

评论 #17136017 未加载

评论 #17138150 未加载

评论 #17136182 未加载

评论 #17135520 未加载

parvenu74大约 7 年前

I like that PostgreSQL can have both relational table and JSONB document collections (NoSQL) in the same database. Use NoSQL where it makes sense and relational tables for data that is inherently relational and query and join both (or batch-process from one to the other). I find this very cool.Of course I wonder if it's too much cool and in trying to do everything it's falling short in some significant and fundamental way.

评论 #17136027 未加载

评论 #17136984 未加载

hodgesrm大约 7 年前

> [2] When I say Postgres is great for analytics, I mean it. In case you don’t know about TimescaleDB, it’s a wrapper on top of PostgreSQL that allows you to INSERT 1 million records per second, 100+ billion rows per server. Crazy stuff. No wonder why Amazon chose PostgreSQL as its base for Redshift.Correction: Amazon chose ParAccel, which was a data warehouse forked from PostgreSQL.Many data warehouse products have followed this path due to licensing. MySQL is GPLv2 which means you can't ship derivative works without releasing your code. PostgreSQL has a permissive license similar to MIT/BSD. You can do anything you want with the code. That's still a major consideration which the article omitted.(Cross-posted from another HN link to same article.)

评论 #17135919 未加载

评论 #17142734 未加载

saosebastiao大约 7 年前

Referenced in the article:<a href="https://wiki.postgresql.org/wiki/Future_of_storage" rel="nofollow">https://wiki.postgresql.org/wiki/Future_of_storage</a>This reads like all my dreams come true.

sufehmi大约 7 年前

What about upgrading to a newer version of PostgreSQL ? Does that still require upgrading the whole databases ?I evaluated PostgreSQL several times in the past, and cancelled once I found out that upgrading to a new version requires upgrading the whole databases - our databases are too big and our uptime requirement are too strict, we can not afford it

评论 #17140686 未加载

cat199大约 7 年前

0 mention of language or datatype extensibility, which is huge.

cat199大约 7 年前

this states all of the drawbacks to process vs thread but none of the benefits (resiliency / compartmentalization of errors, less need for lock coordination and less risk of locking related bottlenecks with scale, somewhat better host OS CPU & IO utilization, etc. )

dspillett大约 7 年前

> With a clustered index, when you look up a record by the primary keyTo nit-pic - it might be the case in mySQL but some DBs (SQL Server for instance) allow the clustering key to be something other than the primary key, and for some analytical workloads this can be much more efficient.

dspillett大约 7 年前

> UUID as a primary key is a terrible idea, by the wayUnless you use a v1 UUID, i.e. via NEWSEQUENTIALID() in SQL Server. IIRC postgres has an equivelant available as a standard module.

patrickg_zill大约 7 年前

I am pretty sure that PG has had clustered indexes for a decade or more... ? e.g. <a href="https://www.postgresql.org/docs/9.1/static/sql-cluster.html" rel="nofollow">https://www.postgresql.org/docs/9.1/static/sql-cluster.html</a>Or is this term referring to a different feature/method than this?One thing not mentioned: PL/SQL vs. whatever the MySQL equivalent is.

评论 #17136381 未加载

评论 #17136401 未加载

评论 #17136081 未加载

13 条评论

asah大约 7 年前

评论 #17136202 未加载

评论 #17140563 未加载

garyclarke27大约 7 年前

mbell大约 7 年前

评论 #17136108 未加载

foxylion大约 7 年前

评论 #17135616 未加载

评论 #17135847 未加载

评论 #17135613 未加载

评论 #17135684 未加载

评论 #17135629 未加载

评论 #17135726 未加载

评论 #17136029 未加载

评论 #17136017 未加载

评论 #17138150 未加载

评论 #17136182 未加载

评论 #17135520 未加载

parvenu74大约 7 年前

评论 #17136027 未加载

评论 #17136984 未加载

hodgesrm大约 7 年前

评论 #17135919 未加载

评论 #17142734 未加载

saosebastiao大约 7 年前

sufehmi大约 7 年前

评论 #17140686 未加载

cat199大约 7 年前

0 mention of language or datatype extensibility, which is huge.

cat199大约 7 年前

dspillett大约 7 年前

> UUID as a primary key is a terrible idea, by the wayUnless you use a v1 UUID, i.e. via NEWSEQUENTIALID() in SQL Server. IIRC postgres has an equivelant available as a standard module.

patrickg_zill大约 7 年前

评论 #17136381 未加载

评论 #17136401 未加载

评论 #17136081 未加载