I'm confused by the assertion that Hive was "much slower than using MySQL with the same dataset." The author makes this claim, and then provides a table that shows Hive performing ~50% better than MySQL on a variety of datasets (none of which really flex the muscle of Hadoop in operating on data sets going beyond single digit GB).<p>Regardless, Impala sounds like it could be pretty sweet!