The two things missing from Cascalog that would take it from great to godlike are 1) an easy way to use Hadoop's distributed cache and 2) a way to run Cascalog jobs on the cluster without going through the compile-then-hadoop-jar cycle each time. I don't know if #2 is even possible, but it would be ridiculously powerful.