He's missing Cython, which is another good option when you're looking for speed.<p>My personal favourite optimisation, from needing to shave a few milliseconds off our API response times, was discovering that it's measurably slower to use *args and **kwargs, and switching to explicitly declaring and passing arguments in the relevant parts of the code.<p>We also did a few other neat things:<p>- Rolled our own UUID-like generator in pure Python (I was surprised this helped, but the profiler doesn't lie)<p>- Switched to working directly with WebOb Request and Response objects rather than using a framework<p>- Used a background thread with a single-slot queue to make sure our response was returned to the user before we emitted the event log message, while still always emitting the message before moving on to the next request<p>- Heavy optimisation of memcache / redis reads and writes<p>Edit: Fixed formatting
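For anyone curious about the single-slot queue trick above, here is a rough sketch of how it could look. It is not the original code: the names `DeferredEventLogger`, `submit`, `wait_until_emitted` and `handle` are made up for illustration, and using `Queue.join()` to enforce "emit before the next request" is my assumption about how that guarantee might be achieved.

```python
import queue
import threading


class DeferredEventLogger:
    """Hypothetical helper: emits event-log messages on a background thread."""

    def __init__(self, emit):
        self._emit = emit                    # callable that actually writes the message
        self._slot = queue.Queue(maxsize=1)  # the single slot
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            message = self._slot.get()
            try:
                self._emit(message)
            except Exception:
                pass  # a real logger would report this; here we just keep the worker alive
            finally:
                self._slot.task_done()

    def submit(self, message):
        # Blocks only if an earlier message is still sitting in the slot.
        self._slot.put(message)

    def wait_until_emitted(self):
        # Returns once every submitted message has been processed by the worker.
        self._slot.join()


emitted = []
logger = DeferredEventLogger(emitted.append)


def handle(request_id):
    # Make sure the previous request's message is out before doing new work.
    logger.wait_until_emitted()
    response = f"response {request_id}"
    # In a real server this call would come after the response has been sent.
    logger.submit(f"event {request_id}")
    return response
```

The point of the design is that the request thread never waits on the log write for its own request, only (at worst) on the one left over from the previous request.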
The order of tactics is wrong. In terms of energy expended, one should try PyPy first! It is amazingly compatible with CPython and can now be embedded directly in CPython programs: <a href="https://github.com/fijal/jitpy" rel="nofollow">https://github.com/fijal/jitpy</a> (supports numpy arrays)<p>Dump your virtualenv, create a new one with PyPy, reinstall your libraries and test your app. It takes less than 20 minutes, even for complex applications.
Serious question: if you have some code that really has to be fast, is it viable to keep it in Python, or should you ultimately end up rewriting it in a compiled language?<p>For example, I am writing code that implements networks that evolve over time for AI research. Prototyping it in Python makes it easy to test things out, but I expect that I will have to rewrite it in C++ or maybe something more fun, like Haskell[1].<p>1. Mostly for the sheer joy of trolling my colleagues with a learning agent monad.