Isn't this just a good argument for event-based programming à la Twisted or Node.js, to avoid the system-thread overheads altogether?

It seems to me that programming with event-based frameworks like Node.js is much less fraught with peril, confusion, and error.

(Having written a couple of small but complete (bare-iron) real-time, thread/process-based operating systems, and applications for them, for workstation-class CPUs back in the '80s, I'm highly aware of the peril and confusion possible. ;-)
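For illustration, here is a minimal sketch of the single-threaded event-loop style being advocated. It uses Python's asyncio, which postdates this discussion (Twisted was the contemporary option), and the host/port are arbitrary, but the shape is the point: one thread, many connections, no locks.

    import asyncio

    async def echo(reader, writer):
        # One coroutine per connection, all multiplexed on a single
        # thread -- no lock contention, no preemptive context switches.
        data = await reader.readline()
        writer.write(data)
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    async def main():
        # Host and port are arbitrary for the example.
        server = await asyncio.start_server(echo, "127.0.0.1", 8888)
        async with server:
            await server.serve_forever()

    asyncio.run(main())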
Attempting the slide 3 example on a dual-core Core 2 Duo CPU running 32-bit Linux with Python 2.6, I get:

    * single-threaded: 8.9s
    * two threads:     10.4s
Overhead is ~17% (compared with the 2x slowdown reported on a quad-core Mac OS X machine). Interesting.
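For anyone who hasn't seen the talk: as best I can reconstruct it, the slide 3 example is a pure CPU-bound countdown along the lines below (the exact iteration count in the talk may differ). Timing this single-threaded versus split across two threads reproduces the effect.

    import time
    from threading import Thread

    def count(n):
        # Pure CPU-bound work: no I/O, so threads fight over the GIL.
        while n > 0:
            n -= 1

    # Single-threaded: two counts run back to back.
    start = time.time()
    count(100000000)
    count(100000000)
    print("single-threaded: %.1fs" % (time.time() - start))

    # Two threads: the same total work, split across two threads.
    start = time.time()
    t1 = Thread(target=count, args=(100000000,))
    t2 = Thread(target=count, args=(100000000,))
    t1.start(); t2.start()
    t1.join(); t2.join()
    print("two threads:     %.1fs" % (time.time() - start))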
I was fortunate enough to attend this talk. It was quite an eye-opener as to how wonky thread performance can be. The overall relationship seems to be something like:

    (# of threads) * (# of cores) == context-switch storm density
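If you want to see the storm directly, one rough way on Unix is to read the process's context-switch counters before and after a threaded run. This sketch uses Python's resource module (which sums the counters across all threads of the process); the thread and iteration counts are arbitrary.

    import resource
    import threading

    def total_ctxt_switches():
        # getrusage(RUSAGE_SELF) reports voluntary (ru_nvcsw) and
        # involuntary (ru_nivcsw) context switches for the process.
        r = resource.getrusage(resource.RUSAGE_SELF)
        return r.ru_nvcsw + r.ru_nivcsw

    def count(n):
        # CPU-bound loop; two of these contend for the GIL.
        while n > 0:
            n -= 1

    before = total_ctxt_switches()
    threads = [threading.Thread(target=count, args=(10000000,))
               for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("context switches during run:", total_ctxt_switches() - before)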
David was clear to say that his presentation was not meant to discourage people from using threads.

The take-aways for me are to pay close attention to how my thread usage is implemented, and to see whether Linux processor affinity would help in situations where you control all the threads in your app (a sketch of that follows below).

The upcoming GIL changes for 3.2 have a dramatic impact on the stability of threading behavior but a negative impact on I/O-bound threads, which they are planning to address.
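Here is a minimal sketch of the affinity idea, assuming Linux and Python 3.3+ for os.sched_setaffinity (at the time of the talk you would have used "taskset -c 0 python script.py" instead). With every thread pinned to one core, the cross-core tug-of-war over the GIL goes away.

    import os
    import threading

    # Pin the whole process (pid 0 == the calling process) to CPU 0.
    # Linux-only; requires Python 3.3+ for os.sched_setaffinity.
    os.sched_setaffinity(0, {0})

    def count(n):
        # CPU-bound loop, as in the benchmark above.
        while n > 0:
            n -= 1

    threads = [threading.Thread(target=count, args=(10000000,))
               for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("ran restricted to CPUs:", os.sched_getaffinity(0))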