This is a well-written article with excellent explanations and I thoroughly enjoyed it.<p>However, none of the variants using vmsplice (i.e., all but the slowest) are safe. When you gift [1] pages to the kernel there is no reliable general-purpose way to know when the pages are safe to reuse again.<p>This post (and the earlier FizzBuzz variant) tries to get around this by assuming the pages are available again after "pipe size" bytes have been written after the gift, _but this is not true in general_. For example, the read side may also use splice-like calls to move the pages to another pipe or IO queue in a zero-copy way, so the lifetime of the page can extend beyond the original pipe.<p>This will show up as race conditions and spontaneously changing data, where a downstream consumer sees the page suddenly change as it is overwritten by the original process.<p>The author of these splice methods, Jens Axboe, had proposed a mechanism which enabled you to determine when it was safe to reuse the page, but as far as I know nothing was ever merged. So the scenarios where you can use this are limited to those where you control both ends of the pipe and can be sure of the exact page lifetime.<p>---<p>[1] Specifically, using SPLICE_F_GIFT.
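To make the hazard described above concrete, here is a minimal sketch, assuming Linux and a libc that exports `vmsplice` (called through `ctypes`, since Python's `os` module has no wrapper for it). The helper name `demo_vmsplice_reuse` is hypothetical. It passes flags of 0 rather than SPLICE_F_GIFT; the point is only that the pipe references the writer's pages instead of copying them, so overwriting the buffer before the reader gets to it changes what the reader sees.

```python
import ctypes
import os


class IoVec(ctypes.Structure):
    """Mirror of struct iovec from <sys/uio.h>."""
    _fields_ = [("iov_base", ctypes.c_void_p), ("iov_len", ctypes.c_size_t)]


def demo_vmsplice_reuse(size=4096):
    """vmsplice a buffer of b"A", overwrite it with b"B", then read the pipe.

    Returns the bytes the reader saw, or None if vmsplice is unavailable
    or fails (assumption: Linux only; this helper is illustrative).
    """
    libc = ctypes.CDLL(None, use_errno=True)
    if not hasattr(libc, "vmsplice"):
        return None
    libc.vmsplice.argtypes = [
        ctypes.c_int, ctypes.POINTER(IoVec), ctypes.c_size_t, ctypes.c_uint
    ]
    libc.vmsplice.restype = ctypes.c_ssize_t

    buf = ctypes.create_string_buffer(b"A" * size, size)
    iov = IoVec(ctypes.addressof(buf), size)
    r, w = os.pipe()
    try:
        # flags=0: no SPLICE_F_GIFT, the pipe still references our pages.
        n = libc.vmsplice(w, ctypes.byref(iov), 1, 0)
        if n != size:
            return None
        # "Reuse" the buffer before the reader has consumed it.
        ctypes.memset(buf, ord("B"), size)
        return os.read(r, size)
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    seen = demo_vmsplice_reuse()
    print(None if seen is None else seen[:16])
```

On a typical Linux kernel the reader would be expected to see the overwritten `B` bytes rather than the `A` bytes that were "sent", which is the kind of spontaneously changing data the comment warns about.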
I once had to change my mental model for how fast some of these things were. I was using `seq` as an input for something else, and my thinking was that it is a small generator program running hot in the CPU and would be super quick, specifically because it would only be writing things out to memory for the next program to consume, not reading anything in.<p>But that was way off, and `seq` turned out to be ridiculously slow. I dug down a little and made a faster version of `seq`, which kind of got me what I wanted. But then I noticed at the end that the point was moot, because just piping it to the next program over the command line was going to be the slow point anyway.<p><a href="https://github.com/tverniquet/hseq" rel="nofollow">https://github.com/tverniquet/hseq</a>
Ran the basic initial implementation on my Mac Studio and was pleasantly surprised to see<p><pre><code> @elysium pipetest % pipetest | pv > /dev/null
102GiB 0:00:13 [8.00GiB/s]
@elysium ~ % pv < /dev/zero > /dev/null
143GiB 0:00:04 [36.4GiB/s]
</code></pre>
Not a valid comparison between the two machines because I don't know what the original machine is, but macOS rarely comes out shining in this sort of comparison, and the simplistic approach here giving 8 GiB/s rather than the author's 3.5 GB/s was better than I'd expected, even given the machine I'm using.
Netmap offers zero-copy pipes (included in FreeBSD, on Linux it's a third party module): <a href="https://www.freebsd.org/cgi/man.cgi?query=netmap&sektion=4" rel="nofollow">https://www.freebsd.org/cgi/man.cgi?query=netmap&sektion=4</a>
Naively, the majority of this overhead (and the slow transfers) seems to be in the scripts/systems using the pipes.<p>I was worried about performance when I saw that zfs send/receive used pipes, for instance, but in practice I had no problems pushing 800MB/s+. It seemed limited by IOPS on my local disk arrays, not by any limits in pipe performance.
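One rough way to check whether the pipe itself is the bottleneck on a given machine is to push bytes through one with nothing else attached. The sketch below is an assumption-laden illustration (the function name `pipe_throughput` and the 64 KiB chunk size are choices made here, not taken from the article): a thread drains the read end while the main thread writes, and the function reports bytes received and bytes per second.

```python
import os
import threading
import time


def pipe_throughput(total_bytes=1 << 30, chunk=1 << 16):
    """Write total_bytes through an os.pipe() and return (received, bytes/s)."""
    r, w = os.pipe()
    received = 0

    def drain():
        nonlocal received
        while True:
            data = os.read(r, chunk)
            if not data:  # writer closed its end
                break
            received += len(data)

    reader = threading.Thread(target=drain)
    reader.start()

    payload = memoryview(bytes(chunk))
    sent = 0
    start = time.perf_counter()
    while sent < total_bytes:
        n = min(chunk, total_bytes - sent)
        sent += os.write(w, payload[:n])
    os.close(w)  # signals EOF to the reader
    reader.join()
    elapsed = time.perf_counter() - start
    os.close(r)
    return received, received / elapsed


if __name__ == "__main__":
    got, rate = pipe_throughput()
    print(f"{got} bytes at {rate / 2**30:.2f} GiB/s")
```

The number this prints includes Python's own overhead, so it is a lower bound on what the pipe can do, not a measurement comparable to the article's C programs.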
For some reason, this made me curious about how fast different languages write individual characters to a pipe:<p>PHP comes in at about 900KiB/s:<p><pre><code> php -r 'while (1) echo 1;' | pv > /dev/null
</code></pre>
Python is about 50% faster at about 1.5MiB/s:<p><pre><code> python3 -c 'while (1): print (1, end="")' | pv > /dev/null
</code></pre>
Javascript is slowest at around 200KiB/s:<p><pre><code> node -e 'while (1) process.stdout.write("1");' | pv > /dev/null
</code></pre>
What's also interesting is that node crashes after about a minute:<p><pre><code> FATAL ERROR: Ineffective mark-compacts
near heap limit Allocation failed -
JavaScript heap out of memory
</code></pre>
All results are from within a Debian 10 docker container with the default repo versions of PHP, Python and Node.<p>Update:<p>Checking with strace shows that Python buffers the output:<p><pre><code> strace python3 -c 'while (1): print (1, end="")' | pv > /dev/null
</code></pre>
Outputs a series of:<p><pre><code> write(1, "11111111111111111111111111111111"..., 8193) = 8193
</code></pre>
PHP and JS do not.<p>So the Python equivalent would be:<p><pre><code> python3 -c 'while (1): print (1, end="", flush=True)' | pv > /dev/null
</code></pre>
Which makes it comparable to the speed of JS.<p>Interesting that PHP is over 4x faster than Python and JS.
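The effect of buffering on the number of write calls can be sketched without strace. The following is an illustration only, under these assumptions: the class `CountingRaw` and function `count_writes` are hypothetical names, output goes to `/dev/null` rather than a pipe so nothing blocks, and the 8192-byte buffer is chosen to roughly match the chunk size seen in the strace output above.

```python
import io
import os


class CountingRaw(io.RawIOBase):
    """Raw writer that forwards to a file descriptor and counts write() calls."""

    def __init__(self, fd):
        super().__init__()
        self.fd = fd
        self.calls = 0

    def writable(self):
        return True

    def write(self, data):
        self.calls += 1
        return os.write(self.fd, data)


def count_writes(n_chars, buffered):
    """Write n_chars single bytes and return how many os.write calls resulted."""
    fd = os.open(os.devnull, os.O_WRONLY)
    try:
        raw = CountingRaw(fd)
        out = io.BufferedWriter(raw, buffer_size=8192) if buffered else raw
        for _ in range(n_chars):
            out.write(b"1")
        out.flush()
        return raw.calls
    finally:
        os.close(fd)


if __name__ == "__main__":
    print("unbuffered:", count_writes(100_000, buffered=False))
    print("buffered:  ", count_writes(100_000, buffered=True))
```

Unbuffered, every character costs one system call; buffered, the calls are batched into a handful of large writes, which is why the default Python one-liner looked faster until `flush=True` was added.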
Android's flavor of Linux uses "binder" instead of pipes because of its security model. IMHO filesystem-based IPC mechanisms (notably pipes) can't be used because of the lack of a world-writable directory, but I may be wrong here.<p>Binder actually comes from Palm (OpenBinder).
I've dumped pixels and pcm audio through a pipe, it certainly was fast enough for that <a href="https://git.cloudef.pw/glcapture.git/tree/glcapture.c" rel="nofollow">https://git.cloudef.pw/glcapture.git/tree/glcapture.c</a> (I suggest gamescope + pipewire to do this instead nowadays however)
I usually just use cat /dev/urandom > /dev/null to generate load. Not sure how this compares to their code.<p>Edit: it’s actually “yes” that I’ve used before for generating load. I remember reading somewhere “yes” was optimized differently than the original Unix command as part of the unix certification lawsuit(s).<p>Long night.
I'm glad huge pages make a big difference because I just spent several hours setting them up. Also everyone says to disable transparent_hugepage, so I set it to `madvise`, but I'm skeptical that any programs outside databases will actually use them.
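With transparent_hugepage set to `madvise`, a program only gets huge pages for regions it explicitly asks for. A minimal sketch of that opt-in, assuming Linux and Python 3.8+ (where `mmap.madvise` and `mmap.MADV_HUGEPAGE` are available); the function name `request_huge_pages` is hypothetical, and whether the kernel actually backs the region with huge pages is not checked here:

```python
import mmap


def request_huge_pages(length=4 * 1024 * 1024):
    """Map anonymous private memory and advise the kernel to use huge pages.

    Returns (mapping, advised), where advised is False if MADV_HUGEPAGE is
    not available or the kernel rejected the request.
    """
    mm = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    advised = False
    if hasattr(mmap, "MADV_HUGEPAGE") and hasattr(mm, "madvise"):
        try:
            # Applies to the whole mapping; this is only a hint to the kernel.
            mm.madvise(mmap.MADV_HUGEPAGE)
            advised = True
        except OSError:
            advised = False
    return mm, advised


if __name__ == "__main__":
    region, ok = request_huge_pages()
    print("MADV_HUGEPAGE accepted:", ok)
    region.close()
```

Programs that never make this call are unaffected by the `madvise` setting, which is consistent with the suspicion that few programs outside databases will use it.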
Something maybe a bit related.<p>I just had 25Gb/s internet installed (<a href="https://www.init7.net/en/internet/fiber7/" rel="nofollow">https://www.init7.net/en/internet/fiber7/</a>), and at those speeds Chrome and Firefox pretty much die when using speedtest.net at around 10-12Gb/s.<p>The symptoms are that the whole tab freezes, the shown speed drops from those 10-12Gb/s to <1Gb/s, and the page starts updating itself only every second or so.<p>IIRC Chrome-based browsers use some form of IPC with a separate networking process, which actually handles networking. I wonder if the local speed limit for socketpair/pipe under Linux was reached and that's why I'm seeing this.
pv is written in perl so isn't the snappiest, I'm surprised to see it score so highly. I wonder what the initial speed would have been if it just wrote to /dev/null
<i>Linux</i> pipes?<p>Oh yes, <i>Linux</i> pipes were invented by Douglas McIlroy while working for Bell Labs on Research UNIX and first described in the man pages of Version 3 Unix, Feb. 1974, just a couple months after Linus Torvalds's 4th birthday.<p>Where and how and when will the unjust and blatant <i>plagiarism</i> of Linux cease? The software was made free by BSD, so feel free to use it, roll it all into GNU/Linux, have at it, but please stop incorrectly describing these things as <i>Linux</i> things. Because the only software that I am certain actually belongs to Linux is systemd. So let's start calling that "Linux systemd," and stop calling anything else Linux anything.