(insert standard disclaimer about being responsible for CUDA here)<p>I'm definitely happy to see more languages with GPU support, but schedulers that distribute work between CPUs and GPUs are a particular interest of mine. The most full-featured one I've seen is StarPU:<p><a href="http://runtime.bordeaux.inria.fr/StarPU/" rel="nofollow">http://runtime.bordeaux.inria.fr/StarPU/</a><p>But there's still a lot of work to be done. For example, it would be very interesting to remove the need for the developer to estimate time spent on the CPU (or one type of processor) versus time spent on the GPU, and to see the effect on developer productivity.
My final-year thesis was on implementing some algorithms with Accelerate, and one of the things I noted was that on a 2009 MacBook Pro (integrated Nvidia GPU with 256 MB of memory), a single-threaded C program runs faster than the Accelerate version, even when all it does is multiply each element of an array by two. The performance gap is even greater for more complicated problems. So, before you jump in and expect better performance on embarrassingly parallel problems, make sure your Nvidia GPU is not integrated and has lots of memory.<p>Of course, this new package is different because it uses both the CPU and the GPU...<p>I also found Accelerate programs hard to debug. You cannot use "trace" to print values during the computation, because trace runs on the CPU rather than on the GPU.
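For reference, the single-threaded C baseline described above (multiplying each element of an array by two) could look roughly like the sketch below. This is not the thesis code: the function name `double_in_place` and the choice of `float` elements are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Double every element of `data` in place on a single CPU thread.
   Hypothetical helper for illustration; not taken from the thesis. */
void double_in_place(float *data, size_t n)
{
    /* A NULL pointer is only acceptable for an empty array. */
    assert(data != NULL || n == 0);

    for (size_t i = 0; i < n; ++i) {
        data[i] *= 2.0f;
    }
}
```

Calling `double_in_place(xs, len)` on an array already in main memory does one multiplication per element and nothing else. A GPU version also has to get the data to the device and launch a kernel, which is one plausible reason a loop this simple can win on a small integrated GPU.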
My wife worked on this a bit with Adam (<a href="https://twitter.com/#!/acfoltzer" rel="nofollow">https://twitter.com/#!/acfoltzer</a>) and Ryan. There is a pending submission to ICFP.<p>The reason they went with CUDA was to plug into Accelerate's existing framework without reinventing the wheel. As meric mentioned, Accelerate is a pain to do anything with, and you can bet dollars to do syntax that this package will generate the hard parts for you.<p>IIRC, ParFunk also has a nice framework in place for distributed computation (though I'm not certain it's completely in working order yet).
That's really cool.<p>I also like how the blog post is available as a literate Haskell file. I think that's a great way to make an introduction more useful, and I wish more languages would take an approach like that for different articles.