I just want to point this out because I feel like there's a good chance a lot of people won't have gotten this far:<p><i>Because our implementation does not explicitly depend on Python we are able to overcome many of the shortcomings of the Python runtime such as running without the GIL and utilising real threads to dispatch custom Numba kernels running at near C speed without the performance limitations of Python.</i>
Bit of a tangent, but I'm wondering if anyone here has had any luck with Cython?<p>I'm starting to run into some performance bottlenecks with Python, and so I'm just now looking at Cython, PyPy, Psyco, and... gasp... C.<p>From what little I've read, Cython is supposed to be as easy as adding some typing and modifying a few loops here and there, and you are in business.
So is there anyone using Python for machine learning in production systems (i.e. not just for prototyping)? I would love to do it, but it seems Java/Mahout is a safer choice, performance-wise.<p>I wonder whether Blaze is a step in that direction.
It would be great to eventually have a GPU version as well (as in the cases of Matlab and R). I saw a brief demo of Matlab on a 15-inch Retina MacBook Pro where the GPU version ran 30x faster than the CPU version.
I read about Continuum after the fellow who developed NumPy left a few days ago to work on it. I am curious to see actual projects using Continuum, so some sort of write-ups would be welcome.
how does this compare to theano? it seems like some of the ideas are similar?<p><a href="http://deeplearning.net/software/theano/" rel="nofollow">http://deeplearning.net/software/theano/</a><p>in general, i like (ie i don't see a better solution than) the idea of having an AST constructed via an embedded language that is implemented by a library. but it does have downsides - integration with other python features is going to be much more limited (it gives the <i>illusion</i> of a python solution, but in practice you're off in some other world that only looks like python).<p>are there more details? i guess the AST is fed to something that does the work. and that something will have an API and be replaceable. but is that something also composable? does it have, say, a part related to moving data and another to evaluating data? so that you can combine "distributed across local machines" with "evaluate on GPU"?