Some random thoughts:<p>• Though the first pipeline is didactic, it can be done entirely within awk:<p><pre><code> awk '
     BEGIN { l = 0 }
     /purple/ {
         if (length($1) >= l) { word = $1; l = length($1) }
     }
     END { print word }
 ' < /usr/share/dict/words
</code></pre>
• Named pipes are neat, but you can also use subshells and additional FDs (I am in no way arguing this is more clear):<p><pre><code> (
   (
     (
       echo out
       echo err >&2
     ) | while read out; do echo "out: $out"; done >&3
   ) 2>&1 | while read err; do echo "err: $err"; done
 ) 3>&1
</code></pre>
• Bash has "set -o pipefail" for cases where you want any process in the pipeline that exits non-zero to cause the entire pipeline to exit non-zero.
Error detection is much easier with pipefail:<p><a href="http://www.gnu.org/software/bash/manual/html_node/Pipelines.html" rel="nofollow">http://www.gnu.org/software/bash/manual/html_node/Pipelines....</a><p>"If pipefail is enabled, the pipeline’s return status is the value of the last (rightmost) command to exit with a non-zero status, or zero if all commands exit successfully"
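A minimal demonstration of the difference (Bash-specific):<p><pre><code>
```shell
#!/bin/bash
# By default a pipeline's status is that of its last command, so the
# failure of 'false' is invisible here.
false | true
echo "without pipefail: $?"   # prints 0

# With pipefail, the rightmost non-zero status wins.
set -o pipefail
false | true
echo "with pipefail: $?"      # prints 1
```
</code></pre>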
If you like pipes, you'll love Pipe Viewer:<p><a href="http://www.ivarch.com/programs/pv.shtml" rel="nofollow">http://www.ivarch.com/programs/pv.shtml</a><p>> pv - Pipe Viewer - is a terminal-based tool for monitoring the progress of data through a pipeline. It can be inserted into any normal pipeline between two processes to give a visual indication of how quickly data is passing through, how long it has taken, how near to completion it is, and an estimate of how long it will be until completion.
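For example (the file and mount-point names here are made up), pv drops into an existing pipeline anywhere cat would fit:<p><pre><code>
```shell
# Watch throughput and ETA while compressing; pv draws its meter on
# stderr, so the data flowing through the pipe is untouched.
pv big.iso | gzip > big.iso.gz

# pv can also rate-limit a stream: -L caps the transfer at 1 MiB/s.
pv -L 1m big.iso > /mnt/backup/big.iso
```
</code></pre>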
Another fun trick: by piping through dd you can add a buffer between processes.<p>Example: the raspberry pi has pretty slow SD performance and the USB bus can get hogged. If you record audio and want to encode it and write it to SD, you can easily get buffer overruns. Easily solved by a 10-second buffer between arecord and flac in my case.
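A sketch of the idea with generic data standing in for the audio stream; for CD-quality raw audio (~176 kB/s), a block size of about 2M corresponds to roughly ten seconds:<p><pre><code>
```shell
# dd copies stdin to stdout unchanged; a large block size (bs) gives
# the pipeline a chunk of slack when the downstream consumer stalls.
# Dedicated tools like mbuffer offer larger ring buffers if needed.
seq 1 100000 | dd bs=2M 2>/dev/null | wc -l   # all 100000 lines arrive intact
```
</code></pre>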
Reminds me of the classic David Beazley course on coroutines: <a href="http://www.dabeaz.com/coroutines/" rel="nofollow">http://www.dabeaz.com/coroutines/</a><p>It highlights a similar pipeline-oriented architecture and eventually ends up being sort of mindblowing.
I hate to be "that guy" but <i>someone</i> has to say something about the Useless Use of Cat.<p><a href="http://www.catb.org/jargon/html/U/UUOC.html" rel="nofollow">http://www.catb.org/jargon/html/U/UUOC.html</a>
Off topic: Today I learned <a href="http://dcurt.is/unkudo" rel="nofollow">http://dcurt.is/unkudo</a>. Peter Sobot, I want my kudos back. (Not that I didn't really appreciate learning about ${PIPESTATUS[*]}.)
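For anyone else who hadn't met it: PIPESTATUS is a Bash array holding the exit status of every stage of the most recent pipeline:<p><pre><code>
```shell
#!/bin/bash
true | false | true
echo "${PIPESTATUS[@]}"   # prints: 0 1 0
```
</code></pre>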
I’ve used pipes for years and had a pretty good understanding of how the system calls of each process in the pipeline interact with their own `stdin` and `stdout` file descriptors, but this article puts it all together really nicely with some good examples.<p>I don’t mind the useless use of `cat` as it can enhance readability for some people. However, I would suggest replacing the Bash while loop with a for loop:<p><pre><code> ls *.flac |
 while read -r song
 do
     flac -d "$song" --stdout |
         lame -V2 - "${song%.flac}.mp3"
 done

 for song in *.flac
 do
     flac -d "$song" --stdout |
         lame -V2 - "${song%.flac}.mp3"
 done
 </code></pre>
Using Bash’s file globbing avoids problems with enumerating the output of `ls` [1]. It also avoids the unnecessary Bash sub-shell spawned for the while loop that follows the first pipe. More importantly, I think it’s a lot more readable while still demonstrating how pipes can be used to efficiently process any number of FLAC files.<p>[1] <a href="http://mywiki.wooledge.org/ParsingLs" rel="nofollow">http://mywiki.wooledge.org/ParsingLs</a>
Neat. I have my own take on this concept[1] using Redis pub/sub instead of queues. Tradeoffs involve being able to lose data if endpoints aren't connected, but you do get the benefit of having multiple inputs and outputs on one stream, which was important for my use case.<p>[1] <a href="https://github.com/whee/rp" rel="nofollow">https://github.com/whee/rp</a>
I recently came across a language called Factor that works very similar, if not identical, to this.<p>Here's a video about it: <a href="https://www.youtube.com/watch?v=f_0QlhYlS8g" rel="nofollow">https://www.youtube.com/watch?v=f_0QlhYlS8g</a>
<p><pre><code> _____________
< unimpurpled >
 -------------
   \
    \
        .--.
       |o_o |
       |:_/ |
      //   \ \
     (|     | )
    /'\_   _/`\
    \___)=(___/
</code></pre>
Part way through my second viewing of the article, I thought, "what is 'unimpurpled'?" Wiktionary didn't know. Google doesn't return useful results for it, even. M-W finally clued me in: it's an obsolete term, the "un" prefix on the verb "empurple", which means to make purple[1].<p>[1] And a few similar things. <a href="https://en.wiktionary.org/wiki/empurple" rel="nofollow">https://en.wiktionary.org/wiki/empurple</a>
I've been playing around with julia[1] this week and discovered the inclusion of a pipe-like operator that removes a lot of the parentheses from functional programming; you can write<p><pre><code> x |> a |> b |> c |> s -> d(s, y) |> e |> ...
</code></pre>
in julia instead of<p><pre><code> e(d(c(b(a(x))), y)) or (e (d (c (b (a x))) y))
</code></pre>
...or whatever is your flavour. I reckon it is impossible to make a serious case against that readability gain.<p>[1] julialang.org
> I’m calling /usr/bin/time here to avoid using my shell’s built-in time command [...].<p>I prefer to use `command` in this scenario:<p>$ command time
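That works because in Bash `time` is a reserved word rather than a builtin, so it is only special when parsed at the start of a pipeline; prefixing it with `command` forces an ordinary PATH lookup instead (resolving to /usr/bin/time on typical systems):<p><pre><code>
```shell
type -t time       # prints: keyword
command time ls    # runs the external time binary, not the shell keyword
```
</code></pre>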
Pipes are also related to the IO monad, similar in a way to the jQuery chaining syntax, which is another hugely popular case. I am utterly amazed that they invented such a powerful functional-programming concept for the shell (well, not 100% pure functional, since there are side effects).
The first few parts make a pretty good pitch for pipelines. Two interesting points about pipelines are the concurrency and the difficulty of handling errors (at least without cluttering the syntax), both of which are often missed by newcomers.
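The concurrency is easy to see: all stages of a pipeline start at once, and an infinite producer is reined in by SIGPIPE once its consumer exits:<p><pre><code>
```shell
# 'yes' would loop forever on its own; head exits after three lines,
# the pipe closes, and yes is terminated by SIGPIPE.
yes hello | head -n 3   # prints three lines of "hello"
```
</code></pre>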
The author has earned him/herself a Useless Use of cat (UUOC) award for not realizing that grep can take a filename argument in the example pipeline.<p>Basically, the example can be shortened to the following:<p><pre><code> grep purple /usr/share/dict/words | # Find words containing 'purple' in the system's dictionary
awk '{print length($1), $1}' | # Count the letters in each word
sort -n | # Sort lines ("${length} ${word}")
tail -n 1 | # Take the last line of the input
cut -d " " -f 2 | # Take the second part of each line
cowsay -f tux # Put the resulting word into Tux's mouth</code></pre>