This topic comes up every now and then. I thought this post was particularly insightful,<p>"One thing to keep in mind when looking at GNU programs is that they're often intentionally written in an odd style to remove all questions of Unix copyright infringement at the time that they were written.<p>The long-standing advice when writing GNU utilities used to be that if the program you were replacing was optimized for minimizing CPU use, write yours to minimize memory use, or vice-versa. Or in this case, if the program was optimized for simplicity, optimize for throughput.<p>It would have been very easy for the nascent GNU project to unintentionally produce a line-by-line equivalent of BSD yes.c, which would have potentially landed them in the 80/90s equivalent of the Google v.s. Oracle case."<p><a href="https://news.ycombinator.com/item?id=14543640" rel="nofollow">https://news.ycombinator.com/item?id=14543640</a>
That last Rust example is less readable than modern template-metaprogramming variants of C++.<p>There is something elegant about being able to beat it on speed with about 30 lines of C.
The GNU variant was discussed recently at:
<a href="https://news.ycombinator.com/item?id=14542938" rel="nofollow">https://news.ycombinator.com/item?id=14542938</a><p>The commit that sped up GNU yes has a summary of the perf measurements:
<a href="https://github.com/coreutils/coreutils/commit/3521722" rel="nofollow">https://github.com/coreutils/coreutils/commit/3521722</a><p>yes can be used to generate arbitrary repeated data for testing and similar purposes, so it is useful for it to be fast.
Really? You didn’t even mention reducing system calls? That’s basically what full_write does: try to output a whole buffer with one system call.<p>In your regular program, even just a normal setvbuf call to set up block buffering would make a huge difference.
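To make the setvbuf point concrete, here is a minimal sketch, assuming plain stdio and an arbitrary 64 KiB buffer size. The helper name emit_lines and its count parameter are made up for illustration (a real yes would loop forever); the idea is only that a fully buffered stream hands the kernel one large write per full buffer instead of one per line.

```c
#include <stdio.h>

/* Hypothetical helper: print `word` plus a newline `count` times through a
 * fully buffered stream, so stdio flushes in large chunks rather than line
 * by line. Returns 0 on success, -1 on error. */
int emit_lines(FILE *out, const char *word, long count)
{
    /* setvbuf has to be called before any other operation on the stream.
     * Passing NULL lets stdio allocate the buffer; 64 KiB is an arbitrary
     * choice for this sketch, not a tuned value. */
    if (setvbuf(out, NULL, _IOFBF, 64 * 1024) != 0)
        return -1;

    for (long i = 0; i < count; i++) {
        if (fputs(word, out) == EOF || fputc('\n', out) == EOF)
            return -1;
    }

    /* Push out whatever is still sitting in the buffer. */
    return fflush(out) == EOF ? -1 : 0;
}
```

Calling something like emit_lines(stdout, "y", 1000000) from main and comparing the system call count under strace against the unbuffered-terminal case should show the difference, though the exact numbers will depend on the libc and on where stdout points.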
The author says "no magic here" for the C version:<p><pre><code> for (;;)
printf("%s\n", argc>1? argv[1]: "y");
</code></pre>
but it's not totally obvious to me whether the argument to printf would be evaluated on every iteration of the for loop or not. Does the compiler know that those don't change, and is the answer to that question fairly basic C knowledge or not?
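As far as the language itself is concerned, the conditional expression is evaluated on every iteration. An optimizer is allowed to hoist it under the as-if rule, since argc and argv are never modified in the loop, but whether a given compiler does so is not guaranteed, and either way the cost is likely tiny next to printf. A sketch of doing the hoisting by hand, with hypothetical helper names pick_word and print_word and a bounded loop so that it terminates:

```c
#include <stdio.h>

/* Hypothetical helper: evaluate the ternary once, before the loop. */
const char *pick_word(int argc, char **argv)
{
    return argc > 1 ? argv[1] : "y";
}

/* Hypothetical bounded version of the loop, so the sketch terminates.
 * The original uses for (;;) and never returns. */
void print_word(FILE *out, const char *word, long times)
{
    for (long i = 0; i < times; i++)
        fprintf(out, "%s\n", word);
}
```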
Can anybody explain to me what this syntax is? This is the first time I've seen anything like it, and I've been programming in C since I was a teenager.<p><pre><code> main(argc, argv)
char **argv;
{
}</code></pre>
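For what it's worth, this looks like the pre-ANSI (K&R) function definition style: parameter names go inside the parentheses, their types are declared between the closing parenthesis and the opening brace, and anything left undeclared (argc here, plus the return type) defaults to int. A rough prototype-style equivalent is sketched below; the name yes_main is a placeholder so that it can sit next to a real main.

```c
/* K&R (pre-ANSI) form, as in the snippet above:
 *
 *     main(argc, argv)
 *     char **argv;
 *     {
 *     }
 *
 * argc is not declared, so it is implicitly int; so is the return type.
 */

/* Prototype-style equivalent. yes_main is a hypothetical name used only
 * for this sketch. */
int yes_main(int argc, char **argv)
{
    (void)argc;   /* unused in this empty body */
    (void)argv;
    return 0;
}
```

Implicit int was dropped in C99 and old-style definitions were removed in C23, so recent compilers may warn about or reject the original form depending on the language mode.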
All the Reddit users from the linked article missed the SIGPIPE trick!<p>You don't need to check the return value from write(), as your process will be terminated with SIGPIPE if it tries writing to a closed pipe.<p>That said, none of them check the return code correctly: if the consumer only reads a byte at a time, you could eventually get 'yyyyyyyyyyyyyyyyyy' (without any newlines).<p>Quite impressive that so many implementations of "yes" have the same bug :)
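The short-write concern can be handled with a loop that keeps calling write() until the whole buffer has gone out. This is a minimal sketch assuming a POSIX system; write_all is a made-up name, only loosely in the spirit of full_write, and it leaves SIGPIPE at its default disposition so a closed pipe still terminates the process.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write all `len` bytes of `buf` to `fd`, resuming
 * after short writes so the output never loses its place in the buffer.
 * Returns 0 on success, -1 on error (errno is left as set by write). */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any byte went out: retry */
            return -1;
        }
        buf += n;               /* skip the bytes the kernel accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

A yes-style loop would then call write_all on its prefilled buffer each time round instead of calling write directly and assuming the full length was written.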
Not impressive at all.
Basically he had to write a lot of manual buffering code to reach GNU yes throughput.
I would suggest using an infrastructure that already provides proper IO.<p>E.g. Perl or CLISP.<p><pre><code> $ perl -C0 -e'print "y\n" x (1024*8) while 1' | pv > /dev/null
^C.4GiB 0:00:11 [6.17GiB/s] [ <=> ]
$ yes | pv > /dev/null
^C.3GiB 0:00:07 [6.64GiB/s] [ <=>
</code></pre>
And with Linux-only vmsplice, the record stands at 123GB/s: <a href="https://www.reddit.com/r/unix/comments/6gxduc/how_is_gnu_yes_so_fast/diua761/" rel="nofollow">https://www.reddit.com/r/unix/comments/6gxduc/how_is_gnu_yes...</a>
I guess a stretch goal would be to make a "shouldi" command that can consume more y's per second than yes can produce. Of course at that point the shell itself would probably become the bottleneck.
<p><pre><code> main(argc, argv)
char **argv;
{
for (;;)
printf("%s\n", argc>1? argv[1]: "y");
}
</code></pre>
Is beautiful, readable, and minimal. The "optimized" Rust version is complicated and over 50 lines of code. At what point does performance optimization go too far?
Funny, I just learned about this command a couple of days ago as a simple way to max out your CPU. I was trying to drain the battery on my MacBook Pro, and running 4 of these at the same time did the trick nicely. Redirected to /dev/null and run in the background: "yes > /dev/null &"
I use the yes command to defrost my lunch. I open up a couple of tabs running<p>yes > /dev/null<p>Then place my frozen lunch on the back of my MacBook. Give it an hour or so and boom, defrosted.
Maybe it's supposed to be slow. Wouldn't a faster 'yes' just spam stdin that much faster? You only need to answer yes occasionally, and anything faster than once a second is plenty.
<p><pre><code> env::args().nth(1).unwrap_or("y".into());
</code></pre>
This ridiculously complicated syntax to perform such a simple thing is why I will never accept Rust. What a clumsy, ugly language. I’ll just stick with learning ANSI Common Lisp; that pays immediate dividends.