Remember 10 years ago when this was a solved problem? There were only two rules: "always use stored procedures" and "never build dynamic SQL". Doing stuff on relational databases was fast.<p>Granted, you often had to actually write Stored Procedures by hand back then. But then if you look at the LINQ in the article, you'll notice that it's pretty much exactly the SQL you'd stick into your stored procedure, just backwards. Had they not bothered introducing LINQ in the first place, they'd have had their 100x performance boost from the get go.<p>Naturally, SP's are not some magic 100x'ing cure-all. They're just a non-generated version of your SQL, which means that you're guaranteed never to have your ORM go nuts and build some monstrosity like the one outlined in the article.<p>You still need to tune your SQL by hand, but at least you can tune <i>the SQL</i> rather than some not-particularly-helpful abstraction on top of it.
The linked article from Sam's blog: <a href="http://samsaffron.com/archive/2011/05/02/A+day+in+the+life+of+a+slow+page+at+Stack+Overflow#" rel="nofollow">http://samsaffron.com/archive/2011/05/02/A+day+in+the+life+o...</a><p>I found it to be much more interesting and a cool look at how he went about actually finding the slowdown and the steps he took to fix it.
If you are building anything more complex than a blog site and expect to take a decent amount of traffic, to the point that you may in fact care about optimizing at all, going with an ORM that writes sql for you is a really really bad idea. I really don't understand the fascination with ORMs today. Some sort of sql-to-object translation layer is no doubt a great thing, but any time you write "sql" in a non-sql language like python or ruby you are letting go of any ability to optimize your queries. For reasonably complicated and trafficked websites that's a disaster simply waiting to happen. This isn't just blind speculation on my part, I've heard a great many stories where very significant resources had to be dedicated to removing ORM from the architecture, and the twitter example should familiar to most.<p>I would go so far as to say that sql writing ORMs are a deeply misguided engineering idea in and of itself, not just badly implemented in its current incarnations. You can't possibly write data access logic entirely in your front end and expect some system to magically create and query a data store for you in the best or even close to the best way.<p>I think the real reason people use ORMs is because they don't have someone at the company that can actually competently operate a sql database, and at any company of a decent size traffic-wise that's simply a fatal mistake. Unless you are going 100% nosql, at which point this discussion is irrelevant.
The term "NoSQLite" shouldn't be used, or at the very least it should be hyphenated "NoSQL-ite." The existence of a little database engine called SQLite makes this term more than a little confusing.
This seems like a fairly common issue with frameworks that convert things to SQL behind the scenes. If you aren't paying attention, it will run a query for each object, instead of one for the whole page.
Think of it the other way around, their engineers didn't optimize any queries and they don't have a dba watching for table scans or nested loop joins. To the amateur crowd it looks like "wow optimization" but if you've been developing seriously for your career this should look like "you have a lot to learn".
I don't think Stack Overflow's case is a case of "see, you can scale SQL databases". I think it's more of a case of "they've forced themselves to HAVE to scale SQL databases by tying themselves to .Net and MS SQL Server".<p>There's nothing wrong with relational databases, it's just that many of the NoSQL databases are newer, encourage denormalization, and implement things like intelligent sharding and mapreduce out of the box.<p>Which one you prefer seems to depend on whether you're more systems-oriented or code-oriented (if you have someone dedicated to tuning your database, why not just let them worry about it?), or whether you're just forced to use one by virtue of the platform you're using.
They turned on indexing on the table and went to a JOIN type query. Boy. Trouble believing steps 1 - 5 were necessary to come to such a rudimentary fix.
<i>The N+1 problem was fixed by changing the ViewModel to use a left join which pulled all the records in one query.</i><p>This is a classic ORM problem. The usual solution: apply a Declarative Batch Query. I've even written one of those things! The reason why you might want to code a Declarative Batch Query option for your Object Relational framework? You can then apply it to similar problem queries in a few minutes total and one line of code.<p>The first operation I optimized went from 2500 SQL calls to just 40, which is an over 50X speedup. And yes, the entire declarative mechanism was written in about a week to satisfy a routine "Change Request," and yes, it ran in production on a heavily used trading application at a major multinational.
I see 10x speedup was given by an index. A while back I was helping a customer with some serious performance issues, and I pulled together a few procs that look at e.g. indexes, most-locked-objects/whatever over the whole DB. I've seen this below trick referenced many times but the source article seems to be here:<p><a href="http://blogs.msdn.com/b/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx" rel="nofollow">http://blogs.msdn.com/b/bartd/archive/2007/07/19/are-you-usi...</a><p>It gives an indication of the highest-impact indexes that are missing. Obviously there's room for experts to tweak, but for most it's an excellent quick tool. Don't create every index it suggests, obviously, just use it as a guide to look for hotspots. It's nice because often it'll surprise you with areas you had no idea were an issue. And it takes a few seconds to run.
"__A NoSQLite might counter that this is what key-value database is for. All that data could have been retrieved in one get, no tuning, problem solved. The counter is then you lose all the benefits of a relational database and it can be shown that the original was fast enough and could be made very fast through a simple turning process, so there is no reason to go NoSQL.__"<p>Moral of Story:
Relational database should be used in most cases. NoSQL is overated & extra-hyped.
There were a number of tools involved in the analysis, including code examination.<p>Can this loop be closed and automated? Why do we all keep having to do this manually?
Imagine an ORM causing a speed issue.<p><i>Performing a code review the code uses a LINQ-2-SQL multi join. LINQ-2-SQL takes a high level ORM description and generates SQL code from it. The generated code was slow and cost a 10x slowdown in production.</i>
They looked at SQL and added an index to their DB?<p>When I read this I actually cringed, because they changed their SQL query in production without looking at the query plan first. Apparently it's the blind leading the blind over at stack overflow, not that it seems to be hurting them.