First, he mentions that VARCHARs are stored in-row, and then he wonders why his server's performance goes down when he joins on a VARCHAR(30000)? And uses EXPLAIN as the very last step? I don't quite get it, honestly.
This is why I like VARCHAR(MAX) in SQL Server if the row is < 8192 chars it's a VARCHAR if it's > 8192 it's TEXT. You only pay the penalty on rows that exceed that threshold. Best of all you can store 2GB of text in a VARCHAR(MAX)<p><a href="http://msdn.microsoft.com/en-us/library/ms176089.aspx" rel="nofollow">http://msdn.microsoft.com/en-us/library/ms176089.aspx</a>
I'm confused; in the example, couldn't he avoid the filesort entirely by just adding an key on t2.t1id? Is it really optimal to be sorting the whole table -- no matter how optimally -- on any common serving path?
To be clear, it's not just "when a TEXT/BLOB is included in a sort" that causes on-disk temp tables. One would expect to take a hit when sorting on a TEXT field.<p>But actually it's much worse than that: you can get on-disk temp tables if you're selecting a TEXT field while doing an ORDER BY or GROUP BY that uses any columns from the second table in a join.
Learned a lot from that, very fascinating ... would have liked to see benchmarks after he made his adjustsments and a comparison with an innodb version of the database.