The key takeaway for me was this:<p>"...the server goes to 288GB (16GB x 18 DIMMs)…so why not? For less than $3,000 we can take this server from 6x16GB to 18x16GB and just not worry about memory for the life of the server. This also has the advantage of balancing all 3 memory channels on both processors, but that’s secondary. Do we feel silly putting that much memory in a single server? Yes, we do…but it’s so cheap compared to say a single SQL Server license that it seems silly not to do it."<p>Clearly, if you are paying tens of thousands of dollars for a database server license, it makes sense to fully utilize the hardware that license covers.<p>Also, in my experience, databases these days pretty much have to stay in memory to be performant at all. The rule I heard from a Facebook DevOps manager who was interviewing with us was: "If you touch it once a day, it goes on SSD; if you touch it once an hour, it goes in memory." Of course, at a certain size you _have_ to scale horizontally with those in-memory databases as well.
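A minimal sketch of that tiering rule, assuming made-up thresholds (once a day vs. once an hour) purely for illustration; the real cutoffs depend entirely on the workload:<p><pre><code>  # Hypothetical storage-tiering heuristic based on the rule of thumb above:
  # hotter data moves up the storage hierarchy.
  def storage_tier(touches_per_day: float) -> str:
      if touches_per_day >= 24:    # touched at least hourly -> keep it in RAM
          return "memory"
      if touches_per_day >= 1:     # touched at least daily -> keep it on SSD
          return "ssd"
      return "spinning disk"       # colder than that -> cheap bulk storage

  print(storage_tier(100))  # memory
  print(storage_tier(2))    # ssd
</code></pre>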
The scale of the problem does not seem all that difficult to overcome (as the author indicates). What I am interested, and pleased, to read is that a fairly popular, high-traffic site is backed by a "plain old" RDBMS. MSSQL, even.
After all of their evangelism about Windows servers, it is rich to see this: "but it’s so cheap compared to say a single SQL Server license that it seems silly not to do it."<p>Yes, that is the problem, isn't it?
I was surprised at how small their database is!<p>We (Defensio) store about 300GB of data, and most of it is purged after 60 days. We're quite far from being the 274th-largest website in the world as well, I assume.<p>It's just very interesting to see how such a huge website can use so little storage.
I have 1.7TB of Fusion-io PCIe SSD at work and it's not even doing that much right now.<p>These guys should open their wallets and get more than a little bit of SSD, probably PCIe, plus max out the RAM. 96GB is what most would <i>start</i> at now in a larger HP server.
It seems truly bizarre to me that in 2012, there is a major operation that is still trying to vertically scale such a trivially shardable DB.<p>Scaling a static Q/A site with a few widgets that require user info/counters should be table stakes.
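For what it's worth, here is a minimal sketch of what "trivially shardable" could look like here, assuming a hypothetical layout where a question and everything hanging off it (answers, comments, votes) lives on exactly one shard; the shard names and modulo scheme are made up for illustration:<p><pre><code>  # Hypothetical shard routing for a Q/A site: route by question id so a
  # typical page view touches a single database server.
  SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

  def shard_for_question(question_id: int) -> str:
      # Modulo routing is the simplest scheme; consistent hashing or range
      # partitioning would make adding shards later less painful.
      return SHARDS[question_id % len(SHARDS)]

  print(shard_for_question(123457))  # db-shard-1
</code></pre>The awkward part is exactly the widgets the parent mentions: user reputation and site-wide counters span shards, so they would need either a separate shared store or periodic aggregation.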
It is very intriguing, as a front-end guy, to see what server admins have to do. While I have to think about performance, I never have to worry about running out of space or random disk I/O. What do front-end founders do when they unexpectedly run into traffic spikes? I'm so glad services like Heroku take that off my plate.
Wait, SO is all on one single node? Or are there reverse proxies?<p>I guess the static content is on a CDN, but all the dynamic content is coming from one machine?<p>Oh wait, never mind (10 Dell R610 IIS web servers):<p><a href="http://highscalability.com/blog/2011/3/3/stack-overflow-architecture-update-now-at-95-million-page-vi.html" rel="nofollow">http://highscalability.com/blog/2011/3/3/stack-overflow-arch...</a>
Does anyone know if there's a specific reason they have multiple databases on the same box? From what I can see, one could trivially install the full-text search engine on another server and reduce much of the space requirement and contention.
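As a toy illustration of that split (not how Stack Overflow actually does it), the pattern would be: the inverted index lives on a dedicated search box that only returns ids, and the primary database just serves lookups by id. The dictionaries below stand in for the two servers:<p><pre><code>  # Toy model of moving full-text search off the primary database.
  SEARCH_INDEX = {                 # stands in for a dedicated search server
      "deadlock": [101, 204],
      "index": [101, 307],
  }
  PRIMARY_DB = {                   # stands in for the main SQL Server box
      101: "How do I debug a deadlock?",
      204: "Reading a deadlock graph in SQL Server",
      307: "When is a covering index worth it?",
  }

  def search(term: str) -> list[str]:
      ids = SEARCH_INDEX.get(term.lower(), [])   # hit the search box first
      return [PRIMARY_DB[i] for i in ids]        # then hydrate rows by id

  print(search("deadlock"))
</code></pre>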
Is it not worrisome that they only have one DB server, with no hot backup or failover machine? I suppose many components on that one machine are redundant - disks, CPUs... but there have got to be many points of failure in there as well, right?
If you're the 274th-largest site in the world, you shouldn't have any trouble getting large, fast SSDs. Or just get a bunch of<p><pre><code> Crucial M4 256 GB (4KB Random Write: Up to 50,000 IOPS)
or
Plextor M3 Series PX-256M3 256GB (4KB Random Write: Up to 65,000 IOPS)
</code></pre>
Plus your whole site is perfect for sharding. Questions are pretty much independent.