Stack Overflow Makes Slow Pages 100x Faster By Simple SQL Tuning

175 pointsby jpmcabout 14 years ago

17 comments

jasonkesterabout 14 years ago

Remember 10 years ago when this was a solved problem? There were only two rules: "always use stored procedures" and "never build dynamic SQL". Doing stuff on relational databases was fast.Granted, you often had to actually write Stored Procedures by hand back then. But then if you look at the LINQ in the article, you'll notice that it's pretty much exactly the SQL you'd stick into your stored procedure, just backwards. Had they not bothered introducing LINQ in the first place, they'd have had their 100x performance boost from the get go.Naturally, SP's are not some magic 100x'ing cure-all. They're just a non-generated version of your SQL, which means that you're guaranteed never to have your ORM go nuts and build some monstrosity like the one outlined in the article.You still need to tune your SQL by hand, but at least you can tune the SQL rather than some not-particularly-helpful abstraction on top of it.

评论 #2510616 未加载

评论 #2510524 未加载

评论 #2510579 未加载

评论 #2510048 未加载

评论 #2510304 未加载

评论 #2510916 未加载

评论 #2510049 未加载

评论 #2510492 未加载

评论 #2510064 未加载

swansonabout 14 years ago

The linked article from Sam's blog: <a href="http://samsaffron.com/archive/2011/05/02/A+day+in+the+life+of+a+slow+page+at+Stack+Overflow#" rel="nofollow">http://samsaffron.com/archive/2011/05/02/A+day+in+the+life+o...</a>I found it to be much more interesting and a cool look at how he went about actually finding the slowdown and the steps he took to fix it.

评论 #2511462 未加载

n_are_qabout 14 years ago

If you are building anything more complex than a blog site and expect to take a decent amount of traffic, to the point that you may in fact care about optimizing at all, going with an ORM that writes sql for you is a really really bad idea. I really don't understand the fascination with ORMs today. Some sort of sql-to-object translation layer is no doubt a great thing, but any time you write "sql" in a non-sql language like python or ruby you are letting go of any ability to optimize your queries. For reasonably complicated and trafficked websites that's a disaster simply waiting to happen. This isn't just blind speculation on my part, I've heard a great many stories where very significant resources had to be dedicated to removing ORM from the architecture, and the twitter example should familiar to most.I would go so far as to say that sql writing ORMs are a deeply misguided engineering idea in and of itself, not just badly implemented in its current incarnations. You can't possibly write data access logic entirely in your front end and expect some system to magically create and query a data store for you in the best or even close to the best way.I think the real reason people use ORMs is because they don't have someone at the company that can actually competently operate a sql database, and at any company of a decent size traffic-wise that's simply a fatal mistake. Unless you are going 100% nosql, at which point this discussion is irrelevant.

评论 #2510454 未加载

评论 #2511316 未加载

评论 #2510697 未加载

评论 #2511543 未加载

评论 #2512165 未加载

评论 #2512208 未加载

评论 #2511467 未加载

评论 #2513384 未加载

baddoxabout 14 years ago

The term "NoSQLite" shouldn't be used, or at the very least it should be hyphenated "NoSQL-ite." The existence of a little database engine called SQLite makes this term more than a little confusing.

评论 #2511296 未加载

评论 #2510838 未加载

smackfuabout 14 years ago

This seems like a fairly common issue with frameworks that convert things to SQL behind the scenes. If you aren't paying attention, it will run a query for each object, instead of one for the whole page.

评论 #2510181 未加载

bobx11about 14 years ago

Think of it the other way around, their engineers didn't optimize any queries and they don't have a dba watching for table scans or nested loop joins. To the amateur crowd it looks like "wow optimization" but if you've been developing seriously for your career this should look like "you have a lot to learn".

评论 #2510976 未加载

评论 #2512617 未加载

jarinabout 14 years ago

I don't think Stack Overflow's case is a case of "see, you can scale SQL databases". I think it's more of a case of "they've forced themselves to HAVE to scale SQL databases by tying themselves to .Net and MS SQL Server".There's nothing wrong with relational databases, it's just that many of the NoSQL databases are newer, encourage denormalization, and implement things like intelligent sharding and mapreduce out of the box.Which one you prefer seems to depend on whether you're more systems-oriented or code-oriented (if you have someone dedicated to tuning your database, why not just let them worry about it?), or whether you're just forced to use one by virtue of the platform you're using.

评论 #2510052 未加载

评论 #2510012 未加载

评论 #2510123 未加载

评论 #2510376 未加载

评论 #2510549 未加载

joriswabout 14 years ago

They turned on indexing on the table and went to a JOIN type query. Boy. Trouble believing steps 1 - 5 were necessary to come to such a rudimentary fix.

stcredzeroabout 14 years ago

The N+1 problem was fixed by changing the ViewModel to use a left join which pulled all the records in one query.This is a classic ORM problem. The usual solution: apply a Declarative Batch Query. I've even written one of those things! The reason why you might want to code a Declarative Batch Query option for your Object Relational framework? You can then apply it to similar problem queries in a few minutes total and one line of code.The first operation I optimized went from 2500 SQL calls to just 40, which is an over 50X speedup. And yes, the entire declarative mechanism was written in about a week to satisfy a routine "Change Request," and yes, it ran in production on a heavily used trading application at a major multinational.

richardwabout 14 years ago

I see 10x speedup was given by an index. A while back I was helping a customer with some serious performance issues, and I pulled together a few procs that look at e.g. indexes, most-locked-objects/whatever over the whole DB. I've seen this below trick referenced many times but the source article seems to be here:<a href="http://blogs.msdn.com/b/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx" rel="nofollow">http://blogs.msdn.com/b/bartd/archive/2007/07/19/are-you-usi...</a>It gives an indication of the highest-impact indexes that are missing. Obviously there's room for experts to tweak, but for most it's an excellent quick tool. Don't create every index it suggests, obviously, just use it as a guide to look for hotspots. It's nice because often it'll surprise you with areas you had no idea were an issue. And it takes a few seconds to run.

gary4garabout 14 years ago

"__A NoSQLite might counter that this is what key-value database is for. All that data could have been retrieved in one get, no tuning, problem solved. The counter is then you lose all the benefits of a relational database and it can be shown that the original was fast enough and could be made very fast through a simple turning process, so there is no reason to go NoSQL.__"Moral of Story: Relational database should be used in most cases. NoSQL is overated & extra-hyped.

JoeAltmaierabout 14 years ago

There were a number of tools involved in the analysis, including code examination.Can this loop be closed and automated? Why do we all keep having to do this manually?

评论 #2510022 未加载

chopsueyarabout 14 years ago

Imagine an ORM causing a speed issue.Performing a code review the code uses a LINQ-2-SQL multi join. LINQ-2-SQL takes a high level ORM description and generates SQL code from it. The generated code was slow and cost a 10x slowdown in production.

ck2about 14 years ago

Wait, it took them how many years to think of doing a sql query analysis?

评论 #2510038 未加载

评论 #2510559 未加载

nikcubabout 14 years ago

This thread and the tips are great, but that page should be cached anywayAwesome that SE have come so far generating every pageview

ltbarcly3about 14 years ago

They looked at SQL and added an index to their DB?When I read this I actually cringed, because they changed their SQL query in production without looking at the query plan first. Apparently it's the blind leading the blind over at stack overflow, not that it seems to be hurting them.

评论 #2510606 未加载

innesabout 14 years ago

Imagine how great StackOverflow would be if all the experts critiquing it here had built it.