Mahout is most mature in the recommendation area, where adoption keeps widening. AOL uses it. Even Amazon is rumored to be about to use it for some purposes. Apple has repurposed some Mahout code for Genius. Dozens of other companies use it for recommendation.<p>In the more traditional data-mining areas of clustering, latent variable discovery and supervised classification, Mahout handles scale and deployment very well. I am a Mahout committer, but I use R all the time, often for prototyping or for small analyses. I would hate to have to deploy an R solution, however. Sampling is a fine solution for the first 80% of the gain, and if you are in a startup situation, that may well be enough for you. On the other hand, efficient deployment is usually pretty important as well.<p>As always, your mileage may vary.<p>I would recommend that you pop over to the Mahout mailing list for better feedback. I doubt that the Hacker News community knows as much about Mahout as the people who develop and use Mahout.
Apache Mahout is interesting, but I still haven't found a strong need to use it. Most of the time I can sample the data that I process in Hadoop and use R to train machine learning algorithms.<p>Does anybody know of any large-scale data mining use of Apache Mahout?