I have more experience running large MySQL databases. Both at SurveyMonkey and Zapier our primary databases were MySQL and were *massive*, as you'd expect from consumer products at that scale. I won't list the numbers here since this is about Postgres, but I wanted to provide the context.

The largest Postgres database I've personally administered is on RDS:

    * Postgres 13.4
    * db.m6g.2xlarge (8 vCPU, 32 GB RAM)
    * 1.3 TB storage
    * Largest tables: 317 GB, 144 GB, and 142 GB
I effectively treat this database as an analytics database and write massive queries with joins against those largest tables, and Postgres handles it just fine.

I have had the occasional bad query plan when using CTEs, which gets cached and takes the entire DB down for a bit, but I've usually been able to fix it with `vacuum analyze` and by rewriting the query to convince Postgres to choose a better plan (a sketch of both follows).

Overall I think Postgres can do a lot of work with very little.
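For readers who haven't hit this, here is a minimal sketch of both fixes, with made-up table names (`events`, `orders`): refreshing planner statistics with `vacuum analyze`, and using the MATERIALIZED / NOT MATERIALIZED keywords (available since Postgres 12, so applicable to 13.4) to steer how a CTE is planned.

    -- Refresh planner statistics (and clean up dead tuples) on the big tables,
    -- so the optimizer stops choosing plans based on stale row estimates.
    VACUUM ANALYZE events;
    VACUUM ANALYZE orders;

    -- Since Postgres 12 CTEs are inlined by default; the keywords below force
    -- either behavior. NOT MATERIALIZED lets predicates be pushed down into the
    -- CTE, while MATERIALIZED can help when the CTE is referenced many times.
    WITH recent AS NOT MATERIALIZED (
        SELECT * FROM events WHERE created_at > now() - interval '30 days'
    )
    SELECT o.id, count(*)
    FROM orders o
    JOIN recent r ON r.order_id = o.id
    GROUP BY o.id;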
A few years ago I ran and maintained a block explorer that indexed Bitcoin mainnet. It ended up as a ~2 TB database, which was too expensive to host on any managed database, so I ran it on the same server that exposed the API.

I remember facing a few challenges the first time I got this running, for example:

- I usually don't worry about optimizing the db schema because Postgres can handle most projects without much effort. That was no longer the case: there were indexes I had to drop because they made inserts considerably slower and took a few GB to store.

- Column types started to matter. I had some columns stored as hex strings and ended up switching them to BYTEA to save space.

- Reads were slow until I updated the Postgres settings. The defaults are very conservative, and while they work for many projects, they won't when you have TBs of data (a rough example of this kind of tuning follows after this list).

- While this isn't database specific, offset pagination no longer worked, and I switched all of those queries to scroll-based pagination (see the sketch below).

- Applying database migrations isn't trivial anymore because some operations can lock the database for hours. For example, changing a column type from TEXT to BYTEA isn't an option if you want to avoid many hours of downtime; instead you create a new column, migrate the rows in the background, and drop the old column once the migration is done (also sketched below).

Overall it was a fun journey that took many tries to get decent performance. There are some other details to consider, but it's been a few years since I did this.
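For the settings point above, a rough illustration of how conservative the defaults are; the numbers below are only a common starting point for a dedicated box with around 64 GB of RAM, not values from the original post.

    -- Stock postgresql.conf is sized for tiny machines (e.g. shared_buffers = 128MB,
    -- work_mem = 4MB). ALTER SYSTEM writes overrides to postgresql.auto.conf.
    ALTER SYSTEM SET shared_buffers = '16GB';        -- often ~25% of RAM; needs a restart
    ALTER SYSTEM SET effective_cache_size = '48GB';  -- planner hint: OS cache + shared_buffers
    ALTER SYSTEM SET work_mem = '64MB';              -- per sort/hash node, so keep it modest
    ALTER SYSTEM SET maintenance_work_mem = '2GB';   -- speeds up index builds and vacuum
    -- The last three take effect after: SELECT pg_reload_conf();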
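The pagination switch is usually implemented as keyset pagination; a minimal sketch with a hypothetical `transactions` table (none of these names are from the original post):

    -- Offset pagination: cost grows with the offset, because the skipped rows
    -- still have to be read and thrown away.
    SELECT block_height, tx_hash
    FROM transactions
    ORDER BY block_height, tx_hash
    LIMIT 50 OFFSET 1000000;

    -- Keyset (scroll-based) pagination: the client sends the last row it saw,
    -- and an index on (block_height, tx_hash) turns this into a cheap range scan.
    SELECT block_height, tx_hash
    FROM transactions
    WHERE (block_height, tx_hash) > ($1, $2)   -- last seen (block_height, tx_hash)
    ORDER BY block_height, tx_hash
    LIMIT 50;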
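And a sketch of the background-migration pattern from the last bullet, again with made-up names: add a nullable BYTEA column, backfill it in small batches so no long lock is held, then swap the columns once the backfill is done.

    -- 1. Add the new column; a nullable column with no default is a fast metadata change.
    ALTER TABLE transactions ADD COLUMN tx_hash_bin BYTEA;

    -- 2. Backfill in small batches (run repeatedly from a script until no rows remain),
    --    so each statement only holds row locks briefly.
    UPDATE transactions
    SET    tx_hash_bin = decode(tx_hash, 'hex')
    WHERE  id IN (
        SELECT id FROM transactions
        WHERE tx_hash_bin IS NULL
        LIMIT 10000
    );

    -- 3. Once the backfill is complete, swap the columns in a short transaction.
    BEGIN;
    ALTER TABLE transactions DROP COLUMN tx_hash;
    ALTER TABLE transactions RENAME COLUMN tx_hash_bin TO tx_hash;
    COMMIT;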
I have a 1 TB+ Postgres database at home that holds market data. I'm moving it over to another Postgres db with time-series extensions to compact it and improve performance, and I'm also thinking about just partitioning everything up into Parquet files. Query performance isn't the best, and I'm currently trying to improve it.
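The extension isn't named above, but as an illustration of the "time-series extension to compact" approach, here is a minimal sketch assuming TimescaleDB 2.x and a made-up `ticks` table:

    -- Assumes the TimescaleDB extension is installed on the target database.
    CREATE EXTENSION IF NOT EXISTS timescaledb;

    CREATE TABLE ticks (
        time   timestamptz NOT NULL,
        symbol text        NOT NULL,
        price  numeric,
        volume numeric
    );

    -- Turn the table into a hypertable, chunked by time.
    SELECT create_hypertable('ticks', 'time');

    -- Enable native compression, segmenting by symbol so per-symbol scans stay fast.
    ALTER TABLE ticks SET (
        timescaledb.compress = true,
        timescaledb.compress_segmentby = 'symbol'
    );

    -- Automatically compress chunks older than 7 days.
    SELECT add_compression_policy('ticks', INTERVAL '7 days');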
<a href="https://www.adyen.com/blog/updating-a-50-terabyte-postgresql-database" rel="nofollow">https://www.adyen.com/blog/updating-a-50-terabyte-postgresql...</a><p>And that was in _2018_. I wonder how big it is now.
I manage a few databases: Postgres dbs of a few hundred GB on-prem, up to a few TB on RDS, and an almost 1 PB cluster of 20 warehouses on Snowflake. Which of those count? (I do data architecture/engineering for a living.)