On a somewhat meta note:<p>>Over the last two and a half years, we’ve picked up a few tips and tools about scaling Postgres that we wanted to share—<i>things we wish we knew when we first launched Instagram.</i><p>A common failure mode for myself and, I suspect, others, is thinking that we have to know <i>every single thing</i> before we start. Good old geek perfectionism of wanting an ideal, elegant setup. A sort of Platonic Ideal, if you will.<p>These guys went on to build one of the <i>hottest web properties on earth,</i> and <i>they</i> didn't get it all right up front.<p>If you're postponing something because you think you need to master all the intricacies of EC2, Postgres, Rails, or $Technology_Name, pay close attention to this example. While they were launching, growing, and being acquired for a cool billion, were you agonizing over the perfect pg_hba.conf?<p>More a note to myself than anything else :)
One of the largest services on the net, and their summary of their database experience is "Overall, we’ve been very happy with Postgres’ performance and reliability."<p>Go Bears. That's awesome. And we should all take a hint...
If high-performance PostgreSQL is critical to your job, here are some resources:<p><a href="http://wiki.postgresql.org/wiki/Slow_Query_Questions" rel="nofollow">http://wiki.postgresql.org/wiki/Slow_Query_Questions</a><p>Query analysis tool <a href="http://explain.depesz.com" rel="nofollow">http://explain.depesz.com</a><p>The mailing list <a href="http://www.postgresql.org/list/pgsql-performance/" rel="nofollow">http://www.postgresql.org/list/pgsql-performance/</a><p>Greg Smith's book <a href="http://www.amazon.com/PostgreSQL-High-Performance-Gregory-Smith/dp/184951030X" rel="nofollow">http://www.amazon.com/PostgreSQL-High-Performance-Gregory-Sm...</a><p>#postgresql on freenode.net
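For a concrete starting point, the usual workflow is to capture a plan with EXPLAIN ANALYZE and paste it into explain.depesz.com for a readable breakdown. A minimal example (the table, columns, and query here are made up):<p><pre><code>-- Hypothetical slow query: ANALYZE runs it for real timings,
-- BUFFERS adds cache hit/read counts to each plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM   likes
WHERE  user_id = 42
ORDER  BY created_at DESC
LIMIT  50;
</code></pre>The text output of that statement is what you paste into the depesz tool or include in a pgsql-performance post.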
I am really glad to see all the adoption and recognition that Postgres is receiving nowadays; there was a time when all you could hear about was MySQL (or maybe that's just my perception). It seems to me that adoption has picked up even more since the Oracle takeover of MySQL, but it could also be that Postgres's feature set has reached a pretty mature point, or maybe a combination of both.
Read scalability has been greatly improved in Postgres 9.2.
It scales pretty much linearly up to 64 concurrent clients and reaches 350,000 queries per second!<p>Write throughput has been improved as well.<p>Check out the presentation by Josh Berkus (one of the 7 members of the Postgres core team) on what's new in Postgres 9.2:<p><a href="http://developer.postgresql.org/~josh/releases/9.2/92_grand_prix.pdf" rel="nofollow">http://developer.postgresql.org/~josh/releases/9.2/92_grand_...</a>
Dear Instagram,<p>How do you deploy database updates? With Rails-style migrations?<p>One thing that bugs me about migrations is that if you use functions or views, the function/view definition has to be copied to a new file. It makes it difficult to see what's been changed. I'm looking forward to <a href="http://sqitch.org/" rel="nofollow">http://sqitch.org/</a> for this reason. (slides: <a href="http://www.slideshare.net/justatheory/sqitch-pgconsimple-sql-change-management-with-sqitch" rel="nofollow">http://www.slideshare.net/justatheory/sqitch-pgconsimple-sql...</a>)
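To illustrate the pain point: in a plain SQL migration, changing a view means restating its entire body, so the migration diff doesn't show what actually changed. A sketch (view and column names are hypothetical):<p><pre><code>-- Migration: expose the new bio column on the profile view.
-- Only one line is new, but the whole definition has to be
-- copied into this migration file, not just the delta.
CREATE OR REPLACE VIEW user_profiles AS
SELECT u.id,
       u.username,
       u.avatar_url,
       u.bio            -- the only actual change
FROM   users u
WHERE  u.is_active;
</code></pre>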
I enjoyed this article and also found a link to this one, which I found equally interesting:<p><a href="http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram" rel="nofollow">http://instagram-engineering.tumblr.com/post/10853187575/sha...</a><p>I worked with the Flickr-style ticket DB id setup at Etsy, and while it was lovely once it was all set up, it's way more complicated (requiring two dedicated servers and a lot of software and operations work). The solution outlined by Instagram of just having a clever schema layout and a stored procedure that safely allocates IDs locally on each logical shard is elegant, and I'm having a hard time blowing holes in it.
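For anyone who hasn't read the linked post, the core trick is roughly this: each logical shard lives in its own Postgres schema with its own sequence, and a PL/pgSQL function packs a millisecond timestamp, the shard number, and the next sequence value into a single 64-bit ID. A simplified sketch of that function (schema name, epoch constant, and shard number follow the article's example but are illustrative, not Instagram's production code):<p><pre><code>-- One sequence per logical shard (here, schema "shard_1").
CREATE SEQUENCE shard_1.table_id_seq;

-- 41 bits of milliseconds since a custom epoch, 13 bits of shard id,
-- 10 bits of per-shard sequence -> one bigint, no ticket servers.
CREATE OR REPLACE FUNCTION shard_1.next_id(OUT result bigint) AS $$
DECLARE
    our_epoch  bigint := 1314220021721;  -- custom epoch, in milliseconds
    seq_id     bigint;
    now_millis bigint;
    shard_id   int    := 1;              -- this schema's shard number
BEGIN
    SELECT nextval('shard_1.table_id_seq') % 1024 INTO seq_id;
    SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_millis;
    result := (now_millis - our_epoch) << 23;  -- time in the high bits
    result := result | (shard_id << 10);       -- shard in the middle
    result := result | seq_id;                 -- sequence in the low bits
END;
$$ LANGUAGE plpgsql;
</code></pre>Tables on that shard can then default their primary key to shard_1.next_id(), so ID generation needs no extra servers and the IDs stay roughly time-ordered.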
> we’re now pushing over 10,000 likes per second at peak<p>These kinds of stats always sound so impressive, but let's imagine:<p><pre><code> - 8 byte timestamp
- 8 byte user ID
- 8 byte post ID
- 128 bytes DBMS overhead
- 128 bytes for user->like index
- 128 bytes for post->like index
</code></pre>
= 3.96MiB/second, ~1015 IOPS, or 342GB per day absolute worst case. A single economy machine with an even remotely decent SSD could handle a full day's data at these rates.
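Spelling out the arithmetic behind those totals (everything rounded, and assuming 4KiB pages for the IO estimate):<p><pre><code>8 + 8 + 8 + 128 + 128 + 128     = 408 bytes per like
408 bytes x 10,000 likes/sec    ~ 4 MB/sec
4 MB/sec in 4KiB pages          ~ 1,000 writes/sec
4 MB/sec x 86,400 sec/day       ~ 350 GB/day
</code></pre>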
We use a lot of PostGIS via GeoDjango, and I made a mental note to remember this article if my Postgres instances ever start ailing. Unfortunately, we haven't pushed these limits nearly as much as Instagram.
Anyone use one of these tools? First I've heard of them, and they seem like they'd make my inner control-freak happy.<p><pre><code> pg_reorg can re-organize tables without any locks, and can be a better
alternative of CLUSTER and VACUUM FULL. This project is not active now,
and fork project "pg_repack" takes over its role. See
https://github.com/reorg/pg_repack for details.</code></pre>
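From a skim of the pg_repack docs, the workflow appears to be: install the extension in the target database, then drive the rebuild from its command-line client. A sketch (I haven't run this in production, and the table/database names are made up):<p><pre><code>-- once, inside the target database:
CREATE EXTENSION pg_repack;

# then from the shell: rebuild one bloated table online;
# it needs an exclusive lock only briefly at the start and end.
$ pg_repack --table=likes mydb
</code></pre>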