I'd like to point out that the Python standard library offers an abstraction over threads and processes that simplifies the kind of concurrent work described in the article: https://docs.python.org/dev/library/concurrent.futures.html

You can write the threaded example as:

    import concurrent.futures
    import random

    def generate_random(count):
        return [random.random() for _ in range(count)]

    if __name__ == "__main__":
        with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
            executor.submit(generate_random, 10000000)
            executor.submit(generate_random, 10000000)
            # I guess we don't care about the results...
Changing this to use multiple processes instead of multiple threads is just a matter of s/ThreadPoolExecutor/ProcessPoolExecutor.

You can also write this more idiomatically (and collect the combined results) as:

    if __name__ == "__main__":
        with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
            out_list = list(
                executor.map(lambda _: random.random(), range(20000000)))
In this example case, this will be quite a bit slower, because the work item (generating a single random number) is trivial compared to the overhead of maintaining a work queue of 20,000,000 items - but in a more typical case, where each work item takes more than a millisecond, it is better to let the executor manage the division of labour.
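To make the process-based variant concrete, here is a minimal sketch (mine, not from the article) of the first example with only the executor class swapped. Note that the map-based version above would need a named, module-level function rather than a lambda for processes, since ProcessPoolExecutor has to pickle the callable and its arguments:

    import concurrent.futures
    import random

    def generate_random(count):
        # Must be a module-level function so it can be pickled and sent
        # to the worker processes.
        return [random.random() for _ in range(count)]

    if __name__ == "__main__":
        # Only the executor class changes; submit() and map() work the same way.
        with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
            futures = [executor.submit(generate_random, 10000000) for _ in range(2)]
            out_list = [f.result() for f in futures]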
This example is not too realistic and narrows things down to the case where a job can be divided into isolated tasks with no shared data/state.

Often threads need to update a shared dict/list etc. With multiprocessing this cannot be done directly, because each process has its own memory space. You can use a Queue for this, but it's horribly inefficient.

Generally speaking, if you need performance and Python is not meeting the requirements, then you are better off using another language.
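For illustration, a minimal sketch (not from the article) of the Queue pattern the parent comment is referring to: with threads the workers could append to a shared list directly, but with multiprocessing every result has to be funnelled back through a Queue, which pickles each item it transfers.

    import multiprocessing
    import random

    def generate_random(count, queue):
        # Each worker process has its own memory, so results are passed
        # back through the Queue rather than a shared list.
        queue.put([random.random() for _ in range(count)])

    if __name__ == "__main__":
        queue = multiprocessing.Queue()
        procs = [multiprocessing.Process(target=generate_random,
                                         args=(10000000, queue))
                 for _ in range(2)]
        for p in procs:
            p.start()
        # Drain the queue before joining, so a full pipe can't block the workers.
        results = [queue.get() for _ in procs]
        for p in procs:
            p.join()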
For the everyday cases where I want to make embarrassingly parallel operations in Python go fast, I find joblib to be a pretty good solution. It doesn't work for everything, but it's quick and simple where it does work.

https://pythonhosted.org/joblib/
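A minimal sketch of what that looks like (assuming joblib is installed and reusing the random-number task from the earlier comment; the Parallel/delayed API is the one documented at the link above):

    import random
    from joblib import Parallel, delayed

    def generate_random(count):
        return [random.random() for _ in range(count)]

    if __name__ == "__main__":
        # Two embarrassingly parallel jobs; n_jobs controls the worker count,
        # and the backend (processes by default) can be swapped without
        # changing the call sites.
        results = Parallel(n_jobs=2)(
            delayed(generate_random)(10000000) for _ in range(2))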
I've had good success using Celery to parallelize tasks/jobs in Python.

www.celeryproject.org

Also, it has a very nice concept called canvas that allows you to chain/combine the data/results of different tasks together.

It also allows you to swap out different implementations of the communication infrastructure (the broker) that Celery uses to distribute and dish out tasks.
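A minimal sketch of the canvas idea (mine, not from the article; the broker/backend URLs are placeholders, and a broker plus a running worker are required before the tasks will execute):

    import random
    from celery import Celery, chain

    # Any supported broker works here (RabbitMQ, Redis, ...).
    app = Celery("tasks",
                 broker="redis://localhost:6379/0",
                 backend="redis://localhost:6379/0")

    @app.task
    def generate_random(count):
        return [random.random() for _ in range(count)]

    @app.task
    def total(numbers):
        return sum(numbers)

    if __name__ == "__main__":
        # chain() feeds the result of one task into the next.
        result = chain(generate_random.s(1000000), total.s())()
        print(result.get())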
For Python developers who dislike the continued existence of the GIL in a multicore world, and who feel that multiprocessing is a poor response given the existence proofs of IronPython and Jython as non-GIL interpreter implementations, please consider moving to Julia.

Julia addresses nearly all the problems I've found with Python over the years, including poor performance, poor threading support on multicore machines, integration with C libraries, etc. I was a big adherent of Python, but as machines got more capable, the ongoing resistance to solving the GIL problem (which IronPython demonstrated can be done with reasonable impact on serial performance) meant I could not continue using the language except for legacy applications.
Have you seen that there is an error in the code for the threading part?

The right way to start a thread is

    thread = threading.Thread(target=CALLABLE, args=ARGS)

and not

    thread = threading.Thread(target=CALLABLE(ARGS))
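To spell out the difference (a minimal sketch using a random-number worker like the one in the earlier comment): target=CALLABLE(ARGS) calls the function immediately in the main thread and hands its return value to Thread, whereas target/args defers the call to the new thread.

    import random
    import threading

    def generate_random(count):
        return [random.random() for _ in range(count)]

    if __name__ == "__main__":
        # Correct: the callable and its arguments are passed separately,
        # so generate_random runs inside the new thread.
        thread = threading.Thread(target=generate_random, args=(10000000,))
        thread.start()
        thread.join()

        # Wrong: generate_random(10000000) would execute right here in the
        # main thread, and its return value (a list) would be passed as target.
        # thread = threading.Thread(target=generate_random(10000000))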
For the example task we could use the multiprocessing Pool and (the undocumented) ThreadPool.

These already implement the worker-pool logic, so we don't have to.
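A minimal sketch of what that could look like (assuming the same random-number task as above; ThreadPool lives in multiprocessing.pool and mirrors Pool's interface):

    import random
    from multiprocessing import Pool
    from multiprocessing.pool import ThreadPool  # thread-backed, same interface as Pool

    def generate_random(count):
        return [random.random() for _ in range(count)]

    if __name__ == "__main__":
        # Process-based pool.
        with Pool(processes=2) as pool:
            proc_results = pool.map(generate_random, [10000000, 10000000])

        # Thread-based pool: s/Pool/ThreadPool/ and the rest stays the same.
        with ThreadPool(processes=2) as pool:
            thread_results = pool.map(generate_random, [10000000, 10000000])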