Some random thoughts:<p>• Though the first pipeline is didactic, it can be done entirely within awk:<p><pre><code> awk '
     BEGIN { l = 0 }
     /purple/ {
         if (length($1) >= l) { word = $1; l = length($1) }
     }
     END { print word }
 ' < /usr/share/dict/words
</code></pre>
• Named pipes are neat, but you can also use subshells and additional FDs (I am in no way arguing this is more clear):<p><pre><code> (
   (
     (
       echo out
       echo err >&2
     ) | while read out; do echo "out: $out"; done >&3
   ) 2>&1 | while read err; do echo "err: $err"; done
 ) 3>&1
</code></pre>
• Bash has "set -o pipefail" for cases where you want any process in the pipeline that exits non-zero to cause the entire pipeline to exit non-zero.
Error detection is much easier with pipefail:<p><a href="http://www.gnu.org/software/bash/manual/html_node/Pipelines.html" rel="nofollow">http://www.gnu.org/software/bash/manual/html_node/Pipelines....</a><p>"If pipefail is enabled, the pipeline’s return status is the value of the last (rightmost) command to exit with a non-zero status, or zero if all commands exit successfully"
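A minimal demonstration of the difference (Bash-specific):<p><pre><code>
```shell
#!/bin/bash
# By default a pipeline's status is that of its last command, so the
# failure of 'false' is invisible here.
false | true
echo "without pipefail: $?"   # prints 0

# With pipefail, the rightmost non-zero status wins.
set -o pipefail
false | true
echo "with pipefail: $?"      # prints 1
```
</code></pre>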
If you like pipes, you'll love Pipe Viewer:<p><a href="http://www.ivarch.com/programs/pv.shtml" rel="nofollow">http://www.ivarch.com/programs/pv.shtml</a><p>> pv - Pipe Viewer - is a terminal-based tool for monitoring the progress of data through a pipeline. It can be inserted into any normal pipeline between two processes to give a visual indication of how quickly data is passing through, how long it has taken, how near to completion it is, and an estimate of how long it will be until completion.
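For example (the file and mount-point names here are made up), pv drops into an existing pipeline anywhere cat would fit:<p><pre><code>
```shell
# Watch throughput and ETA while compressing; pv draws its meter on
# stderr, so the data flowing through the pipe is untouched.
pv big.iso | gzip > big.iso.gz

# pv can also rate-limit a stream: -L caps the transfer at 1 MiB/s.
pv -L 1m big.iso > /mnt/backup/big.iso
```
</code></pre>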
Another fun trick: by piping through dd you can add a buffer between processes.<p>Example: the raspberry pi has pretty slow SD performance and the USB bus can get hogged. If you record audio and want to encode it and write it to SD, you can easily get buffer overruns. Easily solved by a 10-second buffer between arecord and flac in my case.
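A sketch of the idea with generic data standing in for the audio stream; for CD-quality raw audio (~176 kB/s), a block size of about 2M corresponds to roughly ten seconds:<p><pre><code>
```shell
# dd copies stdin to stdout unchanged; a large block size (bs) gives
# the pipeline a chunk of slack when the downstream consumer stalls.
# Dedicated tools like mbuffer offer larger ring buffers if needed.
seq 1 100000 | dd bs=2M 2>/dev/null | wc -l   # all 100000 lines arrive intact
```
</code></pre>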
Reminds me of the classic David Beazley course on coroutines: <a href="http://www.dabeaz.com/coroutines/" rel="nofollow">http://www.dabeaz.com/coroutines/</a><p>It highlights a similar pipeline-oriented architecture and eventually ends up being sort of mindblowing.
I hate to be "that guy" but <i>someone</i> has to say something about the Useless Use of Cat.<p><a href="http://www.catb.org/jargon/html/U/UUOC.html" rel="nofollow">http://www.catb.org/jargon/html/U/UUOC.html</a>
Off topic: Today I learned <a href="http://dcurt.is/unkudo" rel="nofollow">http://dcurt.is/unkudo</a>. Peter Sobot, I want my kudos back. (Not that I didn't really appreciate learning about ${PIPESTATUS[*]}.)
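For anyone else who hadn't met it: PIPESTATUS is a Bash array holding the exit status of every stage of the most recent pipeline:<p><pre><code>
```shell
#!/bin/bash
true | false | true
echo "${PIPESTATUS[@]}"   # prints: 0 1 0
```
</code></pre>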
I’ve used pipes for years and had a pretty good understanding of how the system calls of each process in the pipeline interact with their own `stdin` and `stdout` file descriptors, but this article puts it all together really nicely with some good examples.<p>I don’t mind the useless use of `cat` as it can enhance readability for some people. However, I would suggest replacing the Bash while loop with a for loop:<p><pre><code> ls *.flac |
 while read -r song
 do
     flac -d "$song" --stdout |
         lame -V2 - "${song%.flac}.mp3"
 done

 for song in *.flac
 do
     flac -d "$song" --stdout |
         lame -V2 - "${song%.flac}.mp3"
 done
 </code></pre>
Using Bash’s file globbing avoids problems with enumerating the output of `ls` [1]. It also avoids the unnecessary Bash sub-shell spawned for the while loop that follows the first pipe. More importantly, I think it’s a lot more readable while still demonstrating how pipes can be used to efficiently process any number of FLAC files.<p>[1] <a href="http://mywiki.wooledge.org/ParsingLs" rel="nofollow">http://mywiki.wooledge.org/ParsingLs</a>
Neat. I have my own take on this concept[1] using Redis pub/sub instead of queues. Tradeoffs involve being able to lose data if endpoints aren't connected, but you do get the benefit of having multiple inputs and outputs on one stream, which was important for my use case.<p>[1] <a href="https://github.com/whee/rp" rel="nofollow">https://github.com/whee/rp</a>
I recently came across a language called Factor that works very similar, if not identical, to this.<p>Here's a video about it: <a href="https://www.youtube.com/watch?v=f_0QlhYlS8g" rel="nofollow">https://www.youtube.com/watch?v=f_0QlhYlS8g</a>
<p><pre><code> _____________
< unimpurpled >
 -------------
   \
    \
        .--.
       |o_o |
       |:_/ |
      //   \ \
     (|     | )
    /'\_   _/`\
    \___)=(___/
</code></pre>
Part way through my second viewing of the article, I thought, "what is 'unimpurpled'?" Wiktionary didn't know. Google doesn't return useful results for it, even. M-W finally clued me in: it's an obsolete term, the "un" prefix on the verb "empurple", which means to make purple[1].<p>[1] And a few similar things. <a href="https://en.wiktionary.org/wiki/empurple" rel="nofollow">https://en.wiktionary.org/wiki/empurple</a>
I've been playing around with julia[1] this week and discovered the inclusion of a pipe-like operator that removes a lot of the parentheses from functional programming; you can write<p><pre><code> x |> a |> b |> c |> s -> d(s, y) |> e |> ...
</code></pre>
in julia instead of<p><pre><code> e(d(c(b(a(x))), y)) or (e (d (c (b (a x))) y))
</code></pre>
...or whatever is your flavour. I reckon it is impossible to make a serious case against that readability gain.<p>[1] julialang.org
> I’m calling /usr/bin/time here to avoid using my shell’s built-in time command [...].<p>I prefer to use `command` in this scenario:<p>$ command time
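That works because in Bash `time` is a reserved word rather than a builtin, so it is only special when parsed at the start of a pipeline; prefixing it with `command` forces an ordinary PATH lookup instead (resolving to /usr/bin/time on typical systems):<p><pre><code>
```shell
type -t time       # prints: keyword
command time ls    # runs the external time binary, not the shell keyword
```
</code></pre>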
Pipes are also related to the IO monad, similar in a way to the jQuery chaining syntax, which is another hugely popular case. I am utterly amazed that they invented such a powerful functional-programming concept for the shell (well, not 100% pure functional, since there are side effects).
The first few parts make a pretty good pitch for pipelines. Two interesting points about pipelines are the concurrency and the difficulty of handling errors (at least without cluttering the syntax), both of which are often missed by newcomers.
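The concurrency is easy to see: all stages of a pipeline start at once, and an infinite producer is reined in by SIGPIPE once its consumer exits:<p><pre><code>
```shell
# 'yes' would loop forever on its own; head exits after three lines,
# the pipe closes, and yes is terminated by SIGPIPE.
yes hello | head -n 3   # prints three lines of "hello"
```
</code></pre>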
The author has earned him/herself a Useless Use of cat (UUOC) award for not realizing that grep can take a filename argument in the example pipeline.<p>Basically, the example can be shortened to the following:<p><pre><code> grep purple /usr/share/dict/words | # Find words containing 'purple' in the system's dictionary
awk '{print length($1), $1}' | # Count the letters in each word
sort -n | # Sort lines ("${length} ${word}")
tail -n 1 | # Take the last line of the input
cut -d " " -f 2 | # Take the second part of each line
cowsay -f tux # Put the resulting word into Tux's mouth</code></pre>