Not sure if it was intentional, but the article is quite misleading.<p>The thread pool in node is only used for a limited number of APIs. Pretty much all networking uses native async IO and is unaffected by the size of the thread pool. Things like Oracle's driver are rare exceptions: the typical MySQL/PostgreSQL/redis etc drivers all use native async IO and are unaffected by this.<p>The author only glosses over this briefly. As a result this article leaves the impression that the problem described is the norm, which is not the case.
It's worth noting that DB drivers that actually integrate with libuv are a minority - most DB drivers use Node-level network APIs and are unaffected by such thread pool limits.
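A minimal, self-contained sketch to see this for yourself (a local TCP echo server, so no external hosts or DNS lookups are involved): socket I/O runs on the event loop (epoll/kqueue/IOCP), not the thread pool, so far more than 4 connections can be in flight at once.<p><pre><code>  var net = require('net');

  // Tiny local echo server so the example is self-contained.
  var server = net.createServer(function (socket) { socket.pipe(socket); });

  server.listen(0, '127.0.0.1', function () {
    var port = server.address().port;
    var done = 0;

    // Open far more sockets than the default 4 pool threads;
    // none of them touch the libuv thread pool.
    for (var i = 0; i < 20; ++i) {
      openSocket(i);
    }

    function openSocket(id) {
      var client = net.connect(port, '127.0.0.1', function () {
        client.end('hello ' + id);
      });
      client.on('close', function () {
        if (++done === 20) server.close();
      });
    }
  });</code></pre>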
I actually ran into an issue recently with CPU-intensive tasks blocking my web server. It turns out that "querystring" (used to parse request bodies in web applications) is a synchronous, blocking operation. You'd never notice much slowness until your request bodies are massive (think 50 nested JSON objects and some base64 image data for good measure) and you have multiple per second. At that point, every request is blocked until the previous one is processed. I'm still trying to figure out a solution, after looking into worker threads, etc.
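For reference, a minimal sketch of one way to get that work off the event loop with worker_threads (the file name parse-worker.js and the message shape are just assumptions for illustration, and this naive version only handles one parse in flight at a time):<p><pre><code>  // parse-worker.js -- runs on its own thread, off the event loop
  const { parentPort } = require('worker_threads');
  const querystring = require('querystring');

  parentPort.on('message', function (body) {
    parentPort.postMessage(querystring.parse(body));
  });

  // main.js -- hand the raw body to the worker and wait for the result
  const { Worker } = require('worker_threads');
  const worker = new Worker('./parse-worker.js');

  function parseBodyOffThread(body) {
    return new Promise(function (resolve) {
      worker.once('message', resolve); // naive: assumes one parse at a time
      worker.postMessage(body);
    });
  }</code></pre>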
Thank you for sharing your findings here. Very pertinent to me right now - could actually drastically minimise the amount of research I have to do today. I love HN.
Is there a detailed overview about which functions in libuv (Node.js) rely on blocking primitives and thus use the thread pool to work async?<p>From [here](<a href="http://docs.libuv.org/en/latest/design.html" rel="nofollow">http://docs.libuv.org/en/latest/design.html</a>), it sounds like all file IO is always based on blocking primitives, and native async file IO primitives are not used, although such async file IO primitives do exist and were tried out in libtorrent (<a href="http://blog.libtorrent.org/2012/10/asynchronous-disk-io/" rel="nofollow">http://blog.libtorrent.org/2012/10/asynchronous-disk-io/</a>). The result of that experiment, however, was mostly that the thread pool solution was simpler to code (I guess).<p>From the libuv design doc, the overview is:<p>* Filesystem operations<p>* DNS functions (getaddrinfo and getnameinfo)<p>* User specified code via uv_queue_work()<p>I wonder whether this is really the best solution or if some combination of a thread pool and native async disk IO primitives could perform better.
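A quick sketch to see the split in practice: file I/O and getaddrinfo-based lookups go through the thread pool, while the c-ares-based resolver calls do not.<p><pre><code>  var fs = require('fs');
  var dns = require('dns');

  // Blocking read(2) executed on a libuv worker thread:
  fs.readFile(__filename, function () { console.log('fs.readFile done (thread pool)'); });

  // dns.lookup() wraps getaddrinfo(), also run on the thread pool:
  dns.lookup('example.com', function () { console.log('dns.lookup done (thread pool)'); });

  // dns.resolve4() uses c-ares with non-blocking sockets -- no thread pool involved:
  dns.resolve4('example.com', function () { console.log('dns.resolve4 done (no thread pool)'); });</code></pre>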
Is this another case of "here's the code I ran" when in fact they didn't run it? There should be 3 lines of output, not 6!<p>Also, the code says it will print the time taken since the start of the program, which again doesn't match the output or the conclusion being drawn!<p>Anyway, how come the output isn't in order?
Use named functions to avoid closures!<p><pre><code>  var fs = require('fs');
  var util = require('util');
  var start = process.hrtime();

  for (var i = 0; i < 3; ++i) {
    namedFunction(i);
  }

  // Each call gets its own `id`, so every callback logs the right index.
  function namedFunction(id) {
    fs.readdir('.', function () {
      var end = process.hrtime(start);
      console.log(util.format('readdir %d finished in %ds', id, end[0] + end[1] / 1e9));
    });
  }</code></pre>
Does this mean the async file system functions in Node.js are not really asynchronous? The whole Node.js server would be blocked if the number of concurrent file system operations exceeds the thread pool size. :-(
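A small sketch to poke at this with the default pool size of 4: the extra fs calls wait in the pool's queue until a worker thread frees up, while timers and network callbacks keep firing on the event loop.<p><pre><code>  var fs = require('fs');

  // Shows the event loop itself keeps turning while the pool is busy.
  var tick = setInterval(function () { console.log('event loop still running'); }, 100);

  // Start more fs operations than there are pool threads; the extra
  // ones simply wait in the pool's queue.
  for (var i = 0; i < 8; ++i) {
    readWithId(i);
  }

  function readWithId(id) {
    fs.readFile(__filename, function () {
      console.log('readFile ' + id + ' done');
    });
  }

  setTimeout(function () { clearInterval(tick); }, 1000);</code></pre>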
(noob question) why would the libuv threadpool choose to use a static 4 instead of something like matching the number of processor cores available by default?
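For what it's worth, the default can be overridden with the UV_THREADPOOL_SIZE environment variable, so a rough sketch of matching it to the core count might look like the following (libuv reads the variable when the pool is first initialised, so setting it from inside the script only works if it runs before the first fs/dns.lookup/crypto call; the command-line form UV_THREADPOOL_SIZE=8 node app.js is the safer route).<p><pre><code>  // Must run before anything touches the thread pool.
  var os = require('os');
  process.env.UV_THREADPOOL_SIZE = String(os.cpus().length);

  var fs = require('fs');
  fs.readdir('.', function () {
    console.log('UV_THREADPOOL_SIZE =', process.env.UV_THREADPOOL_SIZE);
  });</code></pre>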
To sum it up: Node.js runs only 4 thread-pool threads by default. Be aware of this.<p>No big deal really. I have had multiple servers running Node.js under load for 3+ years and never had an issue with this.<p>In fact, I've found it helpful. If your database is under load and is already running 4 heavy queries, not giving it any more jobs is actually a good thing.