As I read on their blog, Instagram uses similar logic with PostgreSQL. If anyone is interested, it's accessible at <a href="http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram" rel="nofollow">http://instagram-engineering.tumblr.com/post/10853187575/sha...</a>
It seems MySQL (and hopefully PostgreSQL sometime soon) with custom sharding logic in the app layer still hits the sweet spot for scaling to the order of 100M users. With some thought going into designing an appropriate data model and sharding logic, certain join queries can be delegated to the databases too.
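For illustration, app-layer sharding of this kind usually reduces to a deterministic mapping from an entity ID to a database. A minimal sketch, with all names and the shard count invented for the example:

```python
# Minimal sketch of app-layer shard routing (all names hypothetical).
NUM_SHARDS = 4096  # fixed up front so the mapping never changes

def shard_for_user(user_id: int) -> int:
    """Deterministically map a user to a shard."""
    return user_id % NUM_SHARDS

def db_name(shard_id: int) -> str:
    """Logical database name for a shard, e.g. 'db03429'."""
    return "db%05d" % shard_id

# All of a user's rows live on one shard, so joins scoped to that
# user can still run inside a single database.
print(db_name(shard_for_user(1234567)))  # prints db01671
```

Because every row belonging to one user lands on the same shard, per-user joins stay local; only cross-user queries need scatter-gather.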
> To edit a Pin, we read-modify-write the JSON under a MySQL transaction:<p>I've seen a different approach where you keep a version number on the row, do your read, modify in memory on the app server, then do your write like this:<p><pre><code> UPDATE db03429.pins SET blob='<modified blob>' WHERE local_id=7075733 AND version=53
</code></pre>
then look at the result and make sure that it modified one row. If it returned zero rows, you retry (or show a failure to the user, whichever is appropriate for your use case).<p>The reason you'd do this is so you can't ever have the row locked for a long period of time. A lot of people don't think about database scalability so _even if they know_ that the code they're writing runs while a transaction is held open, they don't care that that transaction is blocking anything else that is trying to read the row they're working on.<p>This can lead to row lock bloat over time, which can cause scalability / availability issues as app servers wait longer and longer to get read (or write) locks on all the rows they care about for their current request. This is mitigated a bit if you're requiring / encouraging people to read from slaves instead of master, though.
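The version-checked write described above (optimistic concurrency control) can be sketched end to end; here SQLite stands in for MySQL, and the table and column names are invented for the example:

```python
import json
import sqlite3

# Sketch of an optimistic (version-checked) write. SQLite stands in
# for MySQL here, and the schema is invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pins (local_id INTEGER PRIMARY KEY, blob TEXT, version INTEGER)"
)
conn.execute(
    "INSERT INTO pins VALUES (7075733, ?, 53)", (json.dumps({"title": "old"}),)
)

def update_pin(local_id, mutate):
    """Read, modify in memory on the app server, then write only if
    the version is unchanged. No row lock is held while we work."""
    row = conn.execute(
        "SELECT blob, version FROM pins WHERE local_id = ?", (local_id,)
    ).fetchone()
    data = json.loads(row[0])
    mutate(data)  # modify in memory, outside any transaction
    cur = conn.execute(
        "UPDATE pins SET blob = ?, version = version + 1 "
        "WHERE local_id = ? AND version = ?",
        (json.dumps(data), local_id, row[1]),
    )
    conn.commit()
    # rowcount == 0 means a concurrent writer bumped the version
    # first: retry the whole read-modify-write, or surface an error.
    return cur.rowcount == 1

ok = update_pin(7075733, lambda d: d.update(title="new"))
```

The key point is that the conditional `UPDATE` is the only statement that touches the row, so the lock it takes is held for microseconds rather than for the duration of app-server work.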
> We only interact with the master in production. You never want to read/write to a slave in production. Slaves lag, which causes strange bugs.<p>These can be worked through, with discipline. You probably only need a fully consistent view on a surprisingly small number of pages.<p>Generally you'll want to read from the master right after a user changes something (if you have a model where user clicks a button -> you go to a different page). User changes setting on a pin, clicks 'save', you render a new page showing their updated pin. This page view should probably come from the master, or else you risk the user's change not showing up, causing confusion.<p>Reads from slaves are fine, as long as you're not using something that was read out of a slave as an input to a database write somewhere else (which you shouldn't be doing anyway!). If you render a page that lets a user change their profile (say site.com/edit_profile), the user data can come from a slave, but if you take _all_ the field values and blindly write those into the master, that's where you run into "time travel" bugs. You just need to find out what the user changed and only make those changes in the master.
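One common way to apply that discipline is "read your own writes" routing: pin a user's reads to the master for a short window after they write. A toy sketch, with all names and the window length invented:

```python
import time

# Toy "read your own writes" router (all names hypothetical).
# After a user writes, route that user's reads to the master for a
# window that should comfortably exceed typical replica lag.
PIN_TO_MASTER_SECONDS = 5.0
_last_write_at = {}  # user_id -> timestamp of that user's last write

def record_write(user_id, now=None):
    """Call this whenever the user performs a write."""
    _last_write_at[user_id] = now if now is not None else time.time()

def choose_replica(user_id, now=None):
    """Return 'master' right after this user wrote, else 'slave'."""
    now = now if now is not None else time.time()
    if now - _last_write_at.get(user_id, 0.0) < PIN_TO_MASTER_SECONDS:
        return "master"  # their write may not have replicated yet
    return "slave"       # lag no longer matters for this user

record_write(42, now=100.0)
print(choose_replica(42, now=101.0))  # prints master
print(choose_replica(42, now=110.0))  # prints slave
```

Real deployments usually keep the last-write timestamp somewhere shared (a cookie or a cache), since consecutive requests rarely hit the same app server.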
Can someone explain to me why people are still recommending MySQL over Postgres? This is a serious question; it just seems that Postgres has more features, and I can't think of any good reason MySQL would scale any differently, other than that it has been around a little longer (so there are more blog posts + experienced engineers)?<p>"MySQL is mature, stable and it just works. Not only do we use it, but it’s also used by plenty of other companies pushing even bigger scale. MySQL supports our need for ordering data requests, selecting certain ranges of data and row-level transactions. It has a hell of a lot more features, but we don’t need or use them. But, MySQL is a single box solution, hence the need to shard our data. Here’s our solution:"<p>What about that paragraph is not true of Postgres also?<p>UPDATE:<p>This schemaless JSON reminds me of this FriendFeed blog post from years ago:<p><a href="https://backchannel.org/blog/friendfeed-schemaless-mysql" rel="nofollow">https://backchannel.org/blog/friendfeed-schemaless-mysql</a>
How are they going to solve the problem of querying into the data that's stored as JSON? E.g. trying to find all pins whose "link" is from, say, reddit. Just pull all the data out and filter it client-side? That's not gonna scale. Or have a sort of cron job that periodically picks out the interesting fields in new JSON data and stores them in a separate table? <-- this is essentially what we do in one of our projects, but I'm curious to see how they do it, or alternatives.
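For what it's worth, one standard answer (the approach the FriendFeed post linked elsewhere in this thread describes) is to keep the blob opaque but copy the queryable fields into side index tables at write time. A sketch with SQLite standing in for MySQL, and the schema invented for the example:

```python
import json
import sqlite3

# Sketch: keep the JSON blob opaque, but copy queryable fields into a
# side index table when writing (SQLite stands in for MySQL; schema invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pins (local_id INTEGER PRIMARY KEY, blob TEXT)")
conn.execute("CREATE TABLE pin_link_index (link_host TEXT, local_id INTEGER)")
conn.execute("CREATE INDEX idx_link ON pin_link_index (link_host)")

def save_pin(local_id, data):
    """Write the blob and its index row in the same transaction."""
    host = data["link"].split("/")[2]  # crude host extraction for the sketch
    with conn:
        conn.execute("INSERT OR REPLACE INTO pins VALUES (?, ?)",
                     (local_id, json.dumps(data)))
        conn.execute("INSERT INTO pin_link_index VALUES (?, ?)", (host, local_id))

save_pin(1, {"link": "http://reddit.com/r/pics/abc"})
save_pin(2, {"link": "http://example.com/x"})

# "All pins whose link is from reddit" becomes an indexed lookup:
rows = conn.execute(
    "SELECT local_id FROM pin_link_index WHERE link_host = ?", ("reddit.com",)
).fetchall()
```

The trade-off is that adding a new queryable field means backfilling a new index table, which can be done lazily by a background job over existing blobs.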
PINTEREST: publish your public outgoing IPs so we can whitelist you and block the rest of Amazon EC2.<p>It is the responsible thing to do; otherwise other bots can spoof you.
Am I missing the obvious? They're using text columns as blob stores for JSON data? How on earth do you query that in MySQL? How do you run a secondary index, on say the user_id? Is it just one gigantic instance of Elasticsearch or Lucene?
This reminds me of how FriendFeed was using MySQL to store unstructured data.<p><a href="http://backchannel.org/blog/friendfeed-schemaless-mysql" rel="nofollow">http://backchannel.org/blog/friendfeed-schemaless-mysql</a>
I'm curious to know how Pinterest or Instagram decide what goes inside each shard. Do they shard by user IDs or something else? Secondly, I'd like to know: if one shard gets more data than another, how do they load balance?
a minor con of this approach is that you have to add an extra layer to your application to do these operations, i.e. to abstract them. But the speed and other gains probably outweigh the cons, and with such a big team, I'm sure they can easily handle such an abstraction layer.<p>One good idea is to open source it: if other people can take advantage of it, they will also help you maintain it and find bugs in it.<p>A question: if you started this now, would you consider using Postgres-XL? AFAIK it supports similar sharding, in a way that's more transparent to developers. Any thoughts on this?
Can somebody help me understand this conversion?<p>Shard ID = (241294492511762325 >> 46) & 0xFFFF = 3429<p>Type ID = (241294492511762325 >> 36) & 0x3FF = 1<p>Local ID = (241294492511762325 >> 0) & 0xFFFFFFFFF = 7075733
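The conversion above is plain bit-shifting and masking: the 64-bit ID packs the shard ID in the bits above bit 46, a 10-bit type ID in bits 36–45, and a 36-bit local ID in the low bits. In Python:

```python
# Decompose the packed 64-bit ID from the comment above.
ID = 241294492511762325

shard_id = (ID >> 46) & 0xFFFF    # drop the low 46 bits, keep 16: which shard
type_id = (ID >> 36) & 0x3FF      # drop the low 36 bits, keep 10: object type
local_id = ID & 0xFFFFFFFFF       # keep the low 36 bits: row id within the shard

print(shard_id, type_id, local_id)  # prints 3429 1 7075733

# Packing goes the other way: shift each field to its position and OR.
assert (shard_id << 46) | (type_id << 36) | local_id == ID
```

`>> n` discards the `n` low-order bits, and the `&` mask keeps only as many bits as the field is wide (`0xFFFF` = 16 bits, `0x3FF` = 10 bits, `0xFFFFFFFFF` = 36 bits), which is how each field is isolated.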
Too many configs and too much maintenance.<p>NoSQL really is better than SQL for scaling your databases. It's just sad that there is no open source NoSQL DB as good as Google Bigtable.
This looks like a big hack to compensate for using the wrong tool. Cassandra would have been a better solution IMO.<p>With Cassandra, you can set replication factors, speed up the writes, and automatically shard the data without having to manage your own "mapping tables".