Making Python Programs Blazingly Fast

223 points by gilad over 5 years ago

33 comments

andrepd over 5 years ago

This is a terrible article. How this has gotten this kind of traction is unexplainable to me.

>This is the program I will be using for demonstration purposes

>Never comes up again for the rest of the post.

???

>Let's show you how to profile code.

>Also, here's a bunch of unprofiled suggestions with such precise and helpful comments as "slow" and "fast".

?????

>Python haters always say, that one of reasons they don't want to use it, is that it's slow. Well, whether specific program - regardless of programming language used - is fast or slow is very much dependant on developer who wrote it and their skill and ability to write optimized and fast programs.

This is so ridiculous it's honestly laughable. It's such an obvious falsehood that the only explanation is that either the person is truly this clueless, or else they are wilfully spewing bullshit. A bare metal language like C/C++ will of course let you do things faster than a heavy dynamic language like Python.

The mental gymnastics people do to justify not learning another tool. You know what they say, if all you have is a hammer, everything looks like a nail.

>First rule of optimization is to not do it.

If this person is representative, this explains why computers are hundreds of times faster but most software feels slower than in 1999.

nayuki over 5 years ago

The title is bad; the article doesn't deliver on the promise of making Python programs "blazingly" fast.

The first example given (the exponential function) is basically the worst scenario, because it's a purely numerical computation expressed in pure Python code. Whereas Python's performance is okay-ish for I/O or calling C modules.

From doing Project Euler solutions, I have ample evidence that for pure numerics (e.g. int, float, array), Java is anywhere from 10× to 30× faster than pure Python code executed in CPython. https://www.nayuki.io/page/project-euler-solutions#benchmark-timings

I believe it is basically impossible for Python to win back all that performance loss without adopting radical and jarring features like static typing, machine-sized integers, and no more "every number is a full-fledged object".

zmmmmm over 5 years ago

> So, let's prove some people wrong and let's see how we can improve performance of our Python programs and make them really fast!

I have to say, the desperate lengths Python programmers will go to to use it for things it was not meant for rather than learn or use other languages is one of the aspects I most dislike about it. However fast you make it, the same effort would have made it that much faster again in a performant language.

kaslai over 5 years ago

> First rule of optimization is to not do it.

This is an unfortunately common misunderstanding of the phrase: "premature optimization is the root of all evil."

Optimization is a crucial part of developing successful software. It can be harmful to get overzealous with certain types of optimization; however, basic wins like using string builder primitives or formatted strings from the outset are hardly premature. Some optimizations can only be realized at the early conceptual stages too; going for those early on isn't always premature.

the_jeremy over 5 years ago

None of the performance tuning suggestions are benchmarked, and I find it hard to believe these would ever make a substantial difference. They could make a statistically significant difference, maybe, but local variables vs class attributes? You should show how much of a time saver this is, because I can't envision a realistic scenario where this is worth the developer time.
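
For anyone who wants to put a number on the local-variable-versus-attribute question, here is a minimal sketch using the standard `timeit` module. The `Counter` class and its method names are hypothetical (nothing like them appears in the article), and the result will vary by machine and CPython version, so treat it as a way to measure rather than as evidence either way.

```python
import timeit


class Counter:
    """Hypothetical class, used only to compare the two access patterns."""

    def __init__(self, step=3):
        self.step = step

    def total_via_attribute(self, n):
        total = 0
        for _ in range(n):
            total += self.step  # attribute lookup on every iteration
        return total

    def total_via_local(self, n):
        step = self.step  # look the attribute up once, then use a local
        total = 0
        for _ in range(n):
            total += step
        return total


def compare(n=10_000, repeat=5, number=20):
    """Return the best-of-`repeat` time in seconds for each variant."""
    counter = Counter()
    t_attr = min(timeit.repeat(lambda: counter.total_via_attribute(n),
                               repeat=repeat, number=number))
    t_local = min(timeit.repeat(lambda: counter.total_via_local(n),
                                repeat=repeat, number=number))
    return t_attr, t_local


if __name__ == "__main__":
    t_attr, t_local = compare()
    print(f"attribute: {t_attr:.4f}s  local: {t_local:.4f}s")
```

Taking the minimum of several repeats is the usual way to reduce noise from other processes; whether the gap matters depends entirely on how hot the loop is.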

benfrederickson over 5 years ago

Interesting article. While I definitely think you should be profiling your code to figure out the hot spots, cProfile has some limitations for profiling: cProfile doesn't give you line numbers, doesn’t work with threads, and significantly slows your program down.

I wrote a tool py-spy (https://github.com/benfred/py-spy) that is worth checking out if you’re interested in profiling Python programs. Not only does it solve those problems with cProfile - py-spy also lets you generate a flamegraph, profile running programs in production, works with multiprocess Python applications, can profile native Python extensions etc.

mbeex over 5 years ago

> Python haters always say

Stopped here immediately. I have been writing software for more than 20 years, mainly in C++ and Python. No professional would start this kind of discussion with this childish attitude (apart from the fact that, content-wise, the problem was beaten to death for decades).

d--b over 5 years ago

Ugh, the whole “python is slow - but it’s great for piping C libraries” trade off has been discussed a gazillion times before.

This article is written by someone who obviously doesn’t know much about CS.

Please HN community, try to not upvote these, it’s a waste of time for all of us.

adrianN over 5 years ago

The only Python programs that can be called "blazingly" fast compared to equivalent programs in performant languages are either spending all their time in I/O, or spending all their time in C. Python is a nice language and with some tricks you might speed it up by a factor of 2-10, but writing the same program in, say, Java, will often be 50-100x faster.

kragen over 5 years ago

The article has some embarrassing errors, and its advice is not going to make your Python programs blazingly fast, but it's a good start.

Resuming a generator in CPython is a lot faster than creating a whole new function call, and especially a whole new method call, contrary to what the article said. But often enough it's faster to just eagerly materialize a list result.

Some other good tips: %timeit, ^C, sort -nk3, Numpy, Pandas, _sre, PyPy, native code. In more detail:

• For benchmarking, use %timeit in IPython. It's much easier and much more precise than time(1). For super lazy benchmarking use %%time instead.

• The laziest profiler is to interrupt your program with ^C. If you do this twice and get the same stack trace, it's a good bet that's where your hotspot is. cProfile is better, at least for single-threaded programs. Others here suggest line_profiler.

• If you have output from the profile or cProfile module saved in a file, you can use the pstats module to re-sort it by different fields. But you probably don't, you have some text it output. The shell command `sort -nk3` will re-sort it numerically by column 3, which is close enough. In Vim you can highlight the output and type !sort -nk3, while in Emacs it's M-| sort -nk3.

• You can probably speed up a pure Python program by a factor of 10 with Numpy or Pandas. If it's not a numerical algorithm, it may not be obvious how, but it's usually feasible. It requires sort of turning the whole problem sideways in your mind. You may not appreciate the effort when you are attempting to modify the code.

• The _sre module is blazingly fast for finite state machines over Unicode character streams. It can be worth it to transmogrify your problem into a regular expression if you can.

• PyPy is probably faster. Use it if you can.

• The standard advice is to rewrite your hotspots in C once you've found them. Maybe this should be updated; Cython, Rust, and C++ are all reasonable alternatives, and for invoking the C etc., you have available cffi and ctypes now. In Jython this is all much simpler because you can easily invoke code in Java, Kotlin, or Clojure from Jython. An underappreciated aspect of this is that using native code can save you a lot of memory as well as instructions, and that may be more important. Consider trying __slots__ first if you suspect this may be the case.
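
The pstats tip can be sketched roughly as follows: save the profile to a file once, then re-sort it as often as you like without re-running the program. The `partial_e` function is a hypothetical stand-in for the article's exponential example, and the helper names `profile_to_file` and `report` are made up for illustration; only the `cProfile` and `pstats` calls are standard library.

```python
import cProfile
import io
import os
import pstats
import tempfile


def partial_e(terms):
    """Hypothetical workload: sum of 1/k! for k in [0, terms)."""
    s, fact = 0.0, 1
    for i in range(terms):
        if i:
            fact *= i
        s += 1 / fact
    return s


def profile_to_file(path, terms=100):
    """Run the workload under cProfile and save the raw stats to `path`."""
    profiler = cProfile.Profile()
    profiler.runcall(partial_e, terms)
    profiler.dump_stats(path)


def report(path, sort_key, limit=5):
    """Load saved stats and return a text report sorted by `sort_key`."""
    out = io.StringIO()
    stats = pstats.Stats(path, stream=out)
    stats.strip_dirs().sort_stats(sort_key).print_stats(limit)
    return out.getvalue()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.prof")
    profile_to_file(path)
    # Same saved data, two different orderings.
    print(report(path, "cumulative"))
    print(report(path, "tottime"))
```

From the command line, `python -m cProfile -o demo.prof script.py` produces the same kind of file, which is handy when you do not want to touch the program itself.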

j88439h84 over 5 years ago

These are all trivial micro-optimizations.

“If you want your code to run faster, you should probably just use PyPy.” - Guido van Rossum

https://pypy.org/

eesmith over 5 years ago

Fourth time this has been posted in 12 days. My comment from 12 days ago is at https://news.ycombinator.com/item?id=21930569 . I pointed out that kernprof profiling shows that 99+% of the time is spent in

    s += num / fact

so none of the techniques described give a blinding speedup. I also suggested pre-compiling the regex.
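
Pre-compiling a regex amounts to something like the sketch below. The pattern and function names are hypothetical, since the article's actual regex is not quoted in this thread. Note that the `re` module keeps an internal cache of recently compiled patterns, so the inline form is usually paying for a cache lookup rather than a full recompile; the saving is real but small unless the call sits in a hot loop.

```python
import re

# Hypothetical pattern: words ending in "ly". Compiled once at import time.
ADVERB_RE = re.compile(r"\b\w+ly\b")


def adverbs_inline(text):
    # Passes the pattern string on every call; `re` looks it up in its cache.
    return re.findall(r"\b\w+ly\b", text)


def adverbs_precompiled(text):
    # Reuses the compiled pattern object directly.
    return ADVERB_RE.findall(text)


if __name__ == "__main__":
    sample = "He carefully and quickly ran"
    print(adverbs_inline(sample))
    print(adverbs_precompiled(sample))
```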

jonstewart over 5 years ago

I regret reading this article and I think the title is clickbait. I was hoping for something like PyPy or Unladen Swallow, etc. The equivalent programs in TFA will be blazingly faster if ported simply to other languages.

drdaeman over 5 years ago

> Don't Access Attributes (example `import re; re.findall(...)` vs `from re import findall; findall(...)`)

I find it a good habit to always import modules and almost never (sane exclusions apply) import individual functions from them. If I use something frequently, I'd alias it for clarity (`import sqlalchemy as sa`).

The reason is that otherwise, patching with mocks becomes somewhat tricky, as you'll have to patch functions in each individual importer module separately. Here's an example: https://stackoverflow.com/a/16134754/116546

Maybe that's wrong but my idea is that I don't want to assume which module calls some specific function but just mock the thing (e.g. make sure Stripe API returns a mock subscription - no matter where exactly it's called from). Then, if I refactor things and move a piece of code around (e.g. extract working with Stripe to a helper module), my unit tests just continue to work.

---

> Based on recent tweet from Raymond Hettinger, the only thing we should be using is f-string, it's most readable, concise AND the fastest method.

I love f-strings, but to the best of my knowledge, one can't use f-strings for i18n/l10n, so all end-user-facing texts still have to use `%` or `format`. E.g. `_("Hello, %(name)s") % {"name": name}`.
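
The patching issue can be reproduced in a single file. This is only a sketch: the `payments`, `by_module`, and `by_name` modules and the `fetch_subscription` function are hypothetical and are built in memory so that nothing needs to exist on disk; only `unittest.mock.patch.object` is real API.

```python
import sys
import types
from unittest import mock

# Hypothetical "library" module holding the function we want to mock.
payments = types.ModuleType("payments")
payments.fetch_subscription = lambda: "real subscription"
sys.modules["payments"] = payments

# Caller 1: imports the module and looks the function up at call time.
by_module = types.ModuleType("by_module")
exec(
    "import payments\n"
    "def status():\n"
    "    return payments.fetch_subscription()\n",
    by_module.__dict__,
)

# Caller 2: binds its own name to the function object at import time.
by_name = types.ModuleType("by_name")
exec(
    "from payments import fetch_subscription\n"
    "def status():\n"
    "    return fetch_subscription()\n",
    by_name.__dict__,
)

# Patch the function once, at its source.
with mock.patch.object(payments, "fetch_subscription",
                       return_value="mock subscription"):
    seen_by_module = by_module.status()  # picks up the mock
    seen_by_name = by_name.status()      # still holds the original function

if __name__ == "__main__":
    print(seen_by_module)
    print(seen_by_name)
```

To mock the second caller you would have to patch the name inside `by_name` as well, which is the per-importer patching described above.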

smabie over 5 years ago

A 30% speed-up in Python is still dog-slow. This is a terrible article; he doesn't even talk about his "example." It's like he gave up 1/10th of the way through the post.

kashug over 5 years ago

The article does not seem to work for me. I only get "undefined" as content. Looking at the network debugger in Firefox, the call to load the article seems to be blocked due to CORS. (It tries to do a call on port 1234 for some reason.)

luord over 5 years ago

Just read the three top comments and their threads. There was absolutely no meaningful discussion or worthwhile contributions in any of them, just fans of less popular languages mostly venting their resentment.

The weirdest thing is that they aren't even using Python, nor does it seem that they're being forced to use it currently, making all this... Ranting (there's literally no other word for this) all the more inexplicable.

I don't understand it; I've been using Go for a year now at work. I hate pretty much everything about it, yet I haven't ranted about it in an article about the language for about that time. There's just no point to it.

ezzzzz over 5 years ago

Anybody with experience able to chime in on a question? So, at a high level, I am looking at using Python at my workplace. We are a weird amalgamation of a Java and Microsoft shop, using Java and Kotlin for 'critical' systems, while heavily relying on SQL Server/SSIS/SSRS for all our back-office processing (batch jobs, reporting, ETL etc). This is the stuff my team is responsible for, and we are constantly hitting the limitations of this stack. My feeling is that Python brings enough to the table as a general purpose language to be a good fit for our use cases: simple automation of file I/O, analytics and reporting, small-footprint web frameworks (Flask), big data tools like Spark, libraries like Pandas, PyTorch etc. Also, I don't have time to learn idiomatic Scala. It's not about laziness, it's just that I feel Python brings enough to the table to be useful, while still being productive and readable.

Then I read threads like this and start second-guessing myself. I see some red flags for sure, but I'm just looking for some validation here. Basically, we have a lot that needs fixing, we need to do it quickly, and I'm wondering if Python can work. We are certainly in the realm of 'big data', and are currently handling everything with procedural SQL, some Java apps that need refactoring, Perl scripts and scheduled tasks on Win Server, and a bloated, poorly implemented Java web app to provide a front end to our poorly maintained, non-normalized database.

imtringued over 5 years ago

I personally dislike the use of caching to increase performance. It is very easy to slap on caching and then the benchmarks say the problem is fixed but you will end up with unpredictability. You can no longer know how much memory your program is using and you don't know if a given function call is the source of a bottleneck or not. Your profiler will show a single hot function when the cache is empty but all the other calls that happen after caching become invisible.
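
If caching is used anyway, some of that unpredictability can be reduced by bounding the cache and inspecting it. A minimal sketch with `functools.lru_cache`, where `slow_square` is a hypothetical stand-in for an expensive function:

```python
from functools import lru_cache


@lru_cache(maxsize=128)  # bounded: at most 128 results are kept
def slow_square(n):
    """Hypothetical stand-in for an expensive computation."""
    return n * n


# Three distinct arguments, two repeats.
for n in [1, 2, 1, 1, 3]:
    slow_square(n)

info = slow_square.cache_info()

if __name__ == "__main__":
    # hits/misses show how much of the "speed" is really the cache.
    print(info)
    # Clearing the cache before profiling makes the cold-path cost visible again.
    slow_square.cache_clear()
    print(slow_square.cache_info())
```

This does not answer the objection that cached calls disappear from a profile, but `cache_info()` at least tells you how many calls were served from the cache.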

Alex3917 over 5 years ago

There are some interesting things in here I wasn't aware of. That being said, you should really be timing individual functions by using line_profiler, otherwise even if you find a slow function you won't have any idea what part is making it slow. Often it's extremely counterintuitive. E.g. compiling regular expressions can be hundreds of times slower than executing them.
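
The compile-versus-execute gap can be measured with the standard library alone. The sketch below is an assumption-laden illustration, not a line_profiler example: the pattern and sample string are hypothetical, and `re.purge()` is called before each compile because `re` otherwise serves repeated compiles from its internal cache, which would hide the cost being measured.

```python
import re
import timeit

# Hypothetical pattern and input, chosen only for illustration.
PATTERN = r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})"
SAMPLE = "2020-01-14T09:30:00"


def compile_uncached():
    re.purge()  # drop the module's pattern cache so compile does real work
    return re.compile(PATTERN)


def measure(number=200):
    """Return (seconds per uncached compile, seconds per match)."""
    compiled = re.compile(PATTERN)
    t_compile = timeit.timeit(compile_uncached, number=number) / number
    t_match = timeit.timeit(lambda: compiled.match(SAMPLE), number=number) / number
    return t_compile, t_match


if __name__ == "__main__":
    t_compile, t_match = measure()
    print(f"compile: {t_compile * 1e6:.1f} us  match: {t_match * 1e6:.1f} us")
```

The exact ratio depends on the pattern and the Python version, so the numbers are worth checking on your own workload before drawing conclusions.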

sbr464 over 5 years ago

I’m currently working on a lib that allows choosing the best implementation of a method based on the current browser/OS.

Performance varies wildly for basic coding decisions across platforms. Especially diff combinations of browser + OS.

I'm deciding on a name still, was thinking concepts like ‘popular’ from the song by Nada Surf, or photo finish (horse racing), or something like unfortunate/wheel of unfortune, poking fun at the need to have this lib.

Here's a messy example that shows this issue (try it in diff browsers).

http://jsben.ch/Uzj2Q

daenz over 5 years ago

>Generators are not inherently faster as they were made to allow for lazy computation, which saves memory rather than time. However, the saved memory can be cause for your program to actually run faster. How? Well, if you have large dataset and you don't use generators (iterators), then the data might overflow CPUs L1 cache, which will slow down lookup of values in memory significantly.

Can someone chime in about the L1 cache? The claim is made without measurements, so I am skeptical.
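
The memory half of the quoted claim is easy to check; the L1 cache half is not something this sketch measures. Assuming two hypothetical helpers, `squares_list` and `squares_gen`, `sys.getsizeof` shows the size of the container itself (for the list, the array of element pointers, not the integer objects they point to):

```python
import sys


def squares_list(n):
    """Eagerly builds all n results in memory."""
    return [i * i for i in range(n)]


def squares_gen(n):
    """Produces results one at a time on demand."""
    return (i * i for i in range(n))


N = 100_000
list_bytes = sys.getsizeof(squares_list(N))  # grows with N
gen_bytes = sys.getsizeof(squares_gen(N))    # small and independent of N

if __name__ == "__main__":
    print(f"list container: {list_bytes} bytes")
    print(f"generator object: {gen_bytes} bytes")
    # Both produce the same values when consumed.
    print(sum(squares_list(N)) == sum(squares_gen(N)))
```

Whether the smaller footprint translates into a speedup would need a timing benchmark on the actual workload.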

im3w1l over 5 years ago

I think you'd be better off offloading the hot part to C++.

ascotan over 5 years ago

https://dev.to/martinheinz/making-python-programs-blazing-fast-4knl

Honestly the quality bar for most things written in Python is pretty low so anything that can help people improve is fine. So kudos to the author.

gridlockd over 5 years ago

The only way to make Python programs "blazingly" fast is to not use the Python interpreter at all in the hot path.

Almost everything the Python interpreter does is ridiculously slow, even for an interpreted language. The language design[1] prevents fast implementations[2].

[1] Restricted subsets of Python do not count

[2] No, PyPy is not fast. It is slow, even for a JIT.

JelteF over 5 years ago

Interesting post, but the code examples are completely unreadable on Firefox + Windows, because of the CSS color: #333 on the .hljs class.

https://imgur.com/tUnlwWa

qwerty456127 over 5 years ago

> Use Functions. This might seem counter intuitive, as calling function will put more stuff onto stack and create overhead from function returns, but it relates to previous point. If you just put your whole code into one file without putting it into function, it will be much slower because of global variables. Therefore you can speed up your code just by wrapping whole code in main function and calling it once, like so.

Wow, this is the one I couldn't expect. I always wrap the scripts in the main function out of pure perfectionism (or perhaps that's OCD) but the fact that a script without it is going to run slower seems counter-intuitive and should really be among the first things taught.
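
One way to see where the difference comes from is to compare the bytecode CPython generates for the same loop at module level and inside a function. This is a sketch using the standard `dis` module; the loop itself is a hypothetical example, and exact opcode names vary between CPython versions, which is why the check below looks for the `NAME` and `FAST` families rather than specific instructions.

```python
import dis

# The same loop, once as top-level script code...
SCRIPT_BODY = """
total = 0
for i in range(1000):
    total += i
"""


# ...and once wrapped in a function.
def main():
    total = 0
    for i in range(1000):
        total += i
    return total


def opnames(code_or_func):
    """Return the set of opcode names used by a code object or function."""
    return {ins.opname for ins in dis.get_instructions(code_or_func)}


# Module-level variables are stored by name in a dictionary (*_NAME opcodes).
module_ops = opnames(compile(SCRIPT_BODY, "<script>", "exec"))
# Function locals live in indexed slots (*_FAST opcodes).
function_ops = opnames(main)

if __name__ == "__main__":
    print(sorted(op for op in module_ops if "NAME" in op))
    print(sorted(op for op in function_ops if "FAST" in op))
```

How much time this saves in practice still has to be measured; the bytecode only explains why there can be a difference at all.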

uncle_j over 5 years ago

Some of these optimisations are very similar to what you used to do in JavaScript with slower JS engines, such as caching a value in a variable rather than constantly accessing a property.

armitron over 5 years ago

Blazingly fast? Not the words I would use..

commandersaki over 5 years ago

Write CPython with an emphasis on C. Then get the speed gains you need.

jokoon over 5 years ago

Aren't there some rules of thumb for writing fast Python code?

Jaxkr over 5 years ago

Surprised PyPy was not mentioned.

AzzieElbab over 5 years ago

Never got past the spinner. Is this an inside joke? The spinner was pretty fast.