The article has some embarrassing errors, and its advice is not going to make your Python programs blazingly fast, but it's a good start.<p>Resuming a generator in CPython is a lot faster than creating a whole new function call, and especially a whole new method call, contrary to what the article said. But often enough it's faster to just eagerly materialize a list result.<p>Some other good tips: %timeit, ^C, sort -nk3, Numpy, Pandas, _sre, PyPy, native code. In more detail:<p>• For benchmarking, use %timeit in IPython. It's much easier and much more precise than time(1). For super lazy benchmarking use %%time instead.<p>• The laziest profiler is to interrupt your program with ^C. If you do this twice and get the same stack trace, it's a good bet that's where your hotspot is. cProfile is better, at least for single-threaded programs. Others here suggest line_profiler.<p>• If you have output from the profile or cProfile module saved in a file, you can use the pstats module to re-sort it by different fields. But you probably don't, you have some text it output. The shell command `sort -nk3` will re-sort it numerically by column 3, which is close enough. In Vim you can highlight the output and type !sort -nk3, while in Emacs it's M-| sort -nk3.<p>• You can probably speed up a pure Python program by a factor of 10 with Numpy or Pandas. If it's not a numerical algorithm, it may not be obvious how, but it's usually feasible. It requires sort of turning the whole problem sideways in your mind. You may not appreciate the effort when you are attempting to modify the code.<p>• The _sre module is blazingly fast for finite state machines over Unicode character streams. It can be worth it to transmogrify your problem into a regular expression if you can.<p>• PyPy is probably faster. Use it if you can.<p>• The standard advice is to rewrite your hotspots in C once you've found them. Maybe this should be updated; Cython, Rust, and C++ are all reasonable alternatives, and for invoking the C etc., you have available cffi and ctypes now. In Jython this is all much simpler because you can easily invoke code in Java, Kotlin, or Clojure from Jython. An underappreciated aspect of this is that using native code can save you a lot of memory as well as instructions, and that may be more important. Consider trying __slots__ first if you suspect this may be the case.