You notice how in these recaps, all you read about is "I learned that X does Y"? They don't seem to have much in the way of lessons that apply to all situations. It's more like, "If you use this specific key/value store, tweak the thingimabob to sassyfraz to make sure your dingo does wibblydong." So if my platform doesn't use that store, your lesson is pointless. If it's a problem with an application, it's great that you're pointing it out, but if it was just an oversight by lazy engineers, leave it out.<p>Then there are the wise lessons on general topics, like the idea that you should "wait until your site grows so you can learn where your scaling problems are going to be". I'm pretty sure we <i>know</i> what your scaling problems are going to be. Every single resource in your platform, and the way it is used, will eventually pose a scaling problem. Wait until they become a problem, or <i>plan</i> for them to become a problem?<p>I'm not that crazy. It really doesn't take <i>a lot</i> of time to plan ahead. Just think about what you have, take an hour or two and come up with some potential problems. Then sort the problems by how imminent and how horrible they are, and make a roadmap to fix them. I know nobody likes to take time to reflect before they start banging away, but consider architectural engineering. Without careful planning, the whole building may fall apart. (Granted, nobody's going to die when your site falls apart, but it's a good mindset to be in.)
> Stay as schemaless as possible. It makes it easy to add features. All you need to do is add new properties without having to alter tables.<p>And at the same time they use and praise Postgres a lot, so it cannot be about NoSQL.<p>I am wondering what they mean exactly. My own inclination would be to read it as: use a few very big and narrow tables of the form "who - did - what - when - where", e.g. "userA - vote up - comment1 - timestamp - foosubreddit", and also "userB - posted - link1 - timestamp - barsubreddit".<p>Then more or less all events happening on the site end up in the same table, and you are somewhat schemaless, in the sense that adding new functionality does not require a schema change.<p>If someone with inside knowledge can confirm this is not too far from what the reddit team meant, I'd appreciate it.
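To make the "who - did - what - when - where" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration of the narrow-table guess above, not reddit's actual schema: the table name `events`, its column names, the timestamps, and the `saved` action are all made up for the example.

```python
import sqlite3

# Hypothetical narrow event table; every name here is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE events (
           actor      TEXT    NOT NULL,  -- who
           action     TEXT    NOT NULL,  -- did
           target     TEXT    NOT NULL,  -- what
           created_at INTEGER NOT NULL,  -- when (Unix timestamp)
           place      TEXT    NOT NULL   -- where
       )"""
)


def record(actor, action, target, created_at, place):
    """Append one event row; no feature-specific columns are needed."""
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
        (actor, action, target, created_at, place),
    )


def actions_by(actor):
    """Return (action, target) pairs for one user, oldest first."""
    rows = conn.execute(
        "SELECT action, target FROM events WHERE actor = ? ORDER BY created_at",
        (actor,),
    )
    return rows.fetchall()


record("userA", "vote up", "comment1", 1377000000, "foosubreddit")
record("userB", "posted", "link1", 1377000060, "barsubreddit")
# A new feature is just a new value in the action column, with no ALTER TABLE.
record("userA", "saved", "link1", 1377000120, "barsubreddit")
```

The trade-off is that the database no longer knows which actions are valid, so that checking moves into application code.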
Reddit is an interesting case; they seem to have an almost unlimited amount of user goodwill. Case in point: I get the "you broke reddit" pageload failure message an awful lot and I'm sure others do too. How many other sites have userbases that would tolerate such a high number of errors?
> For comments it’s very fast to tell which comments you didn’t vote on, so the negative answers come back quickly.<p>Can you go into more detail about how this is used? If reddit needs to display a page that has 100 comments, do they query Cassandra for the user's voting status on each of those 100 comments?<p>I thought Cassandra was pretty slow at reads (slower than Postgres), so how does using Cassandra make it fast here?
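One plausible reading of the quoted line is Bloom filters, which Cassandra keeps per SSTable so it can skip disk reads for keys that are definitely absent. Whether that is exactly what the talk meant is my assumption. Below is a toy Bloom filter in Python showing why "you didn't vote on this" can be answered cheaply; the class, the `user:comment` key format, and the helper `vote_keys_to_fetch` are hypothetical and not Cassandra's API.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "maybe present".
        return all(self.bits[pos] for pos in self._positions(key))


def vote_keys_to_fetch(bloom, user, comment_ids):
    """Keep only the comments that might have a stored vote for this user."""
    return [c for c in comment_ids if bloom.might_contain(f"{user}:{c}")]
```

On a page of 100 comments where the user voted on two, most lookups would be rejected by the filter without touching the stored data, and only the "maybe" keys need a real read.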
This looks like a summary of the talk on InfoQ on the subject:<p><a href="http://www.infoq.com/presentations/scaling-reddit" rel="nofollow">http://www.infoq.com/presentations/scaling-reddit</a>
"Treat nonlogged in users as second class citizens. By always giving logged out always cached content Akamai bears the brunt for reddit’s traffic. Huge performance improvement. "<p>This is the lowest of low hanging fruit. Many people don't realize it but a ton of huge media sites use Akamai to offload most of their "read-only" traffic.
> Used the Pylons (Django was too slow), a Python based framework, from the start<p>This isn't quite right. It was web.py at the beginning. They started using Pylons after the Conde Nast acquisition.
I can certainly appreciate what Reddit has accomplished, but the thought of losing the abilities of a full RDBMS for a key-value store makes my hair stand on end.<p>I've yet to find schema changes limiting in my ability to code against a DB (and I use MySQL, which is one of the most limiting in this regard). Plus, I appreciate the ability to offload things like data consistency and relationships to the database. I understand, however, that others might not feel the same way.
<i>Queues were a saviour. When passing work between components put it into a queue. You get a nice little buffer.</i><p>What does reddit use for queuing?
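Not an answer to which queue reddit runs, but here is a minimal sketch of the "put it into a queue" pattern using only Python's standard library. The in-process `queue.Queue` is a stand-in for a real message broker, and `run_pipeline` and the upper-casing "work" are made up for illustration.

```python
import queue
import threading


def run_pipeline(jobs, maxsize=100):
    """Hand jobs from a producer to one worker through a bounded queue."""
    work = queue.Queue(maxsize=maxsize)  # the buffer between components
    results = []

    def worker():
        while True:
            job = work.get()
            if job is None:  # sentinel: no more work is coming
                work.task_done()
                break
            results.append(job.upper())  # stand-in for the real processing
            work.task_done()

    consumer = threading.Thread(target=worker)
    consumer.start()
    for job in jobs:
        work.put(job)  # blocks if the buffer is full (back-pressure)
    work.put(None)
    work.join()  # wait until every queued item has been processed
    consumer.join()
    return results
```

The producer only blocks when the buffer is full, so a short burst of votes or comments is absorbed by the queue instead of stalling the component that accepted them.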
This appears to be a summary of an InfoQ presentation, which was discussed about two weeks ago @ <a href="https://news.ycombinator.com/item?id=6222726" rel="nofollow">https://news.ycombinator.com/item?id=6222726</a>
Is it common for people to use Postgres as a key-value store in production (rather than Redis)? This is the first time I have heard of it, and I am just starting to use Postgres now, so I was a bit surprised.
Jeremy also gave a great Airbnb tech talk on this topic:<p><a href="http://nerds.airbnb.com/reddit-netflix-and-beyond-building-scalable-and-reliable-architectures-in-the-cloud/" rel="nofollow">http://nerds.airbnb.com/reddit-netflix-and-beyond-building-s...</a>
Can someone elaborate/clarify this:<p><i>> Users connect to a web tier which talks to an application tier.</i><p>So, I'm assuming the web tier is nginx/haproxy and the application tier is Pylons.<p>Are the 240 servers mentioned all running <i>both</i> the web tier and the app tier?
jedberg - you speak of automation. Did you use anything (or is there anything in use currently) that handles auto scaling for EC2? Puppet/Chef/Ansible, etc.? Or was this all done by hand?