These guys appear to be having a hell of a lot of fun. Their technique of wrapping a stack from Amazon EC2 through Hadoop all the way up into Clojure was the kind of thing I had wondered was possible, so it's pretty awesome to hear it is being done, and done well, by someone. The idea of iterating with Clojure in a REPL on a small dataset to develop or refine an algorithm, then pressing a button and seeing how it does on some large dataset on EC2, sounds sublime.<p>Even if they never release any of the glue code that makes all this happen, just knowing it is possible is very encouraging.
An “in the trenches” interview on building a machine learning application with Rails & Hadoop. During the interview on FlightCaster, Brad describes some of the challenges of working with flight data, statistical approaches for flight prediction, false negatives in FlightCaster, Clojure, Hadoop & Amazon EC2, YCombinator, and more. I was pleasantly surprised at how open Brad was about the model internals and data crunching pitfalls.
The article mentions Bradford's Amazon wishlist.<p>For the curious: <a href="http://www.amazon.com/gp/registry/wishlist/3RB4REDIKE28I" rel="nofollow">http://www.amazon.com/gp/registry/wishlist/3RB4REDIKE28I</a><p>And those that have been purchased (a better list): <a href="http://www.amazon.com/gp/registry/wishlist/3RB4REDIKE28I?reveal=purchased" rel="nofollow">http://www.amazon.com/gp/registry/wishlist/3RB4REDIKE28I?rev...</a>
It's great to see someone taking piles of data and pulling some meaning out of it. I think it's easy enough these days to see the potential of data-mining a site like Facebook, but I expect a lot of value to come out of sites like FlightCaster that are working in domains folks don't normally think of as data-intensive. Google was more or less data-mining, but it was a relatively easy set of data to access: public websites. Now we're seeing the exploitation of more obscure, but not necessarily less valuable, data.
An informative interview providing a better understanding of the amazing work this team has done. A well-balanced team with great creativity, energy, dedication, and perseverance, not to mention their awesome talent. Great teamwork resulting in a quality product. You guys rock!