Parallel Programming with Python

218 points by uaaa, almost 7 years ago

16 comments

quietbritishjim, almost 7 years ago
In response to the multiple comments here complaining that multithreading is impossible in Python without using multiple processes, because of the GIL (global interpreter lock):

This is just not true, because C extension modules (i.e. libraries written to be used from Python but whose implementations are written in C) can release the global interpreter lock while inside a function call. Examples of these include numpy, scipy, pandas and tensorflow, and there are many others. Most Python processes that are doing CPU-intensive computation spend relatively little time actually executing Python, and are really just coordinating the C libraries (e.g. "multiply these two matrices together").

The GIL is also released during IO operations like writing to a file or waiting for a subprocess to finish or send data down its pipe. So in most practical situations where you have a performance-critical application written in Python (or more precisely, the top layer is written in Python), multithreading works fine.

If you are doing CPU-intensive work in pure Python and you find things are unacceptably slow, then the simplest way to boost performance (and probably simplify your code) is to rewrite chunks of your code in terms of these C extension modules. If you can't do this for some reason then you will have to throw in the Python towel and re-write some or all of your code in a natively compiled language (if it's just a small fraction of your code then Cython is a good option). But this is the best course of action regardless of the threads situation, because pure Python code runs orders of magnitude slower than native code.
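A minimal sketch of the IO point above, not code from the comment itself. It assumes `time.sleep` as a stand-in for a blocking IO call (it releases the GIL while waiting); `fake_io_task` and the 0.2 s delays are hypothetical choices for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(delay):
    """Hypothetical stand-in for blocking IO; time.sleep releases the GIL while it waits."""
    time.sleep(delay)
    return delay


def run_serial(delays):
    """Run the tasks one after another and return the elapsed wall-clock time."""
    start = time.perf_counter()
    for delay in delays:
        fake_io_task(delay)
    return time.perf_counter() - start


def run_threaded(delays):
    """Run the tasks in one thread each and return the elapsed wall-clock time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        list(pool.map(fake_io_task, delays))
    return time.perf_counter() - start


delays = [0.2] * 4
serial_time = run_serial(delays)
threaded_time = run_threaded(delays)
print(f"serial: {serial_time:.2f}s, threaded: {threaded_time:.2f}s")
```

Because the waits overlap, the threaded run should finish in roughly the time of one task rather than the sum of all four. A pure-Python CPU-bound loop in place of `time.sleep` would not be expected to show the same gain.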
elcombato, almost 7 years ago
> (note that you must be using Python 2 for this workshop and not using Python 3. Complete this workshop using Python 2, then read about the small changes if you are interested in using Python 3)

Why use legacy Python for this?
ilovetux, almost 7 years ago
I find it strange that nobody ever seems to mention Python's concurrent.futures module [0], which was added in Python 3.2. I think asyncio got a lot of attention when it came out in Python 3.4 and concurrent.futures took a back seat. This article also doesn't mention the module in its Python 2 and 3 differences link.

asyncio is a good library for asynchronous I/O, but concurrent.futures gives us some pretty nifty tooling which makes concurrent programming (with ThreadPoolExecutor) and parallel programming (with ProcessPoolExecutor) pretty easy to get right. The Future class is a pretty elegant solution for continuing execution while a background task is being executed.

[0] https://docs.python.org/3/library/concurrent.futures.html
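A short sketch of the Future pattern described above, as an illustration rather than anything from the linked article. The `word_count` task and the sample `documents` are hypothetical; `submit`, `as_completed` and `result` are the standard concurrent.futures calls.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def word_count(text):
    """Hypothetical task: count whitespace-separated words."""
    return len(text.split())


documents = ["the quick brown fox", "hello world", "a b c d e"]

with ThreadPoolExecutor(max_workers=3) as pool:
    # submit() returns a Future immediately, so the caller keeps running.
    futures = {pool.submit(word_count, doc): doc for doc in documents}
    # Other work could happen here while the tasks run in the background.
    # as_completed() yields each Future as soon as its result is ready.
    results = {futures[future]: future.result() for future in as_completed(futures)}

total = sum(results.values())
print(results, total)
```

ProcessPoolExecutor exposes the same `submit` interface, so it can usually be swapped in for CPU-bound pure-Python functions. In that case the function needs to be defined at module top level, and on platforms that spawn worker processes the entry point should sit under an `if __name__ == "__main__":` guard.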
mpweiher, almost 7 years ago
"...take advantage of the processing power of multicore processors"

Step 1: stop using Python.

"You can have a second core when you know how to use one"

Now don't get me wrong, Python is a perfectly fine language for lots of things, but not for taking optimal advantage of the CPU.

https://benchmarksgame-team.pages.debian.net/benchmarksgame/faster/python3-gcc.html

Performance relative to C is somewhere between one and two orders of magnitude slower. Considering how much harder and more error-prone multi-core is, maybe first try a fast sequential solution.
ram_rar, almost 7 years ago
I love Python, but it's seriously incapable of doing non-trivial concurrent tasks. The multiprocessing module doesn't count. I hope the Python core devs take some inspiration from golang for developing the right abstractions for concurrency.
andbberger, almost 7 years ago
IMO ray [1] is the greatest thing to happen in Python parallelism since the invention of sliced bread.

It also includes the best currently available hyperparameter tuning framework!

[1] https://github.com/ray-project/ray
another-cuppa, almost 7 years ago
I think a lot of this complexity can be avoided by just writing single-threaded Python and using GNU parallel for running it on multiple cores. You can even trivially distribute the work across a cluster that way.
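One way this approach could look, as a hedged sketch: a single-threaded script that handles the files named on its command line, with GNU parallel supplying the fan-out. The script name `count_lines.py`, the `data/*.txt` inputs and the line-counting job are all hypothetical; only the `parallel ... ::: args` form is GNU parallel's own syntax.

```python
import sys
from pathlib import Path


def count_lines(path):
    """Hypothetical per-file job: count the lines in one input file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def main(argv):
    # Plain single-threaded work over whatever files were passed in.
    # GNU parallel starts one process per input, for example:
    #   parallel python count_lines.py ::: data/*.txt
    for name in argv[1:]:
        print(f"{Path(name).name}\t{count_lines(name)}")
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

The script contains no threading or multiprocessing code at all, which is the point of the suggestion: each invocation is an independent process, so the GIL never comes into play. For the cluster case, GNU parallel documents an `--sshlogin` option for running jobs on remote hosts.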
jillesvangurp, almost 7 years ago
Did they ever fix the global interpreter lock? It's sort of a show stopper for doing stuff concurrently in Python. I've done a bit of batch processing using the multiprocessing module, which uses processes instead of threads. This works, but it is a bit of a kludge if you are used to languages that support concurrency properly.
mwyau, almost 7 years ago
mpi4py should be included. It's a wrapper for the MPI library, which is the de facto standard for scientific computing: https://mpi4py.readthedocs.io/en/stable/
natvert, almost 7 years ago
Sweet, a guide! I always end up rolling my own thread pool / manager. I wish something like the parallel gem for Ruby existed in pyland...
magwa101, almost 7 years ago
Concurrency in Python always ends up being the reason to drop it and reimplement in Go. Also, the code ends up littered with type checks...
wenning, almost 7 years ago
I think using Python 3's multiprocessing and async is better for production.
gnufx, almost 7 years ago
Multi-core parallelism isn't so interesting for serious computation. You want to be able to use large distributed HPC systems, but Python doesn't seem to have the equivalent of https://pbdr.org for R.
kilon, almost 7 years ago
One more epic discussion on Python, where we have the unique opportunity to learn that using C libraries from Python is "cheating".

I could not agree more.

It's definitely cheating to use C code, with the exception of most Python libraries, which are already to a large extent nothing more than thin wrappers over existing C libraries, or the tiny fact that by far the most popular implementation of Python, CPython, is almost 50% implemented in the C language, including the standard library. The author even dared include "C" in the name of the implementation.

Those cheaters, becoming bolder and bolder every day.

Damn them!!!
goerz, almost 7 years ago
The GIL has considerable benefits: I don’t have to worry about whether Python functions are thread-safe. Thread-based parallelism is hard to get right, and given the number of workarounds, Python’s GIL is a total non-issue.
walterstucco, almost 7 years ago
> Parallel Programming with Python?

What about no?

Don't get me wrong, I don't like Python as a language, but it's a fine tool and many useful programs have been written with it.

But parallel programming? No, thanks.