TechEcho
GNU Parallel Cheat Sheet [pdf]

119 points by ole_tange about 6 years ago

11 comments

bonoboTP about 6 years ago
Ah, the rare case of nagware in GNU.

From the man page:

"--citation Print the BibTeX entry for GNU parallel and silence citation notice. If it is impossible for you to run --bibtex you can use --will-cite. If you use --will-cite in scripts to be run by others you are making it harder for others to see the citation notice. The development of GNU parallel is indirectly financed through citations, so if your users do not know they should cite then you are making it harder to finance development. However, if you pay 10000 EUR, you should feel free to use --will-cite in scripts."

Asking for donations/citations is one thing, but putting this junk in the man page about 10000 EUR and nagging users is quite an annoyance. How GNU allows such junk in their man pages puzzles me. Obviously the GPL allows one to remove the nagware and redistribute, but I don't know if anyone has forked it.
gcommer about 6 years ago
A few slightly more advanced GNU Parallel features that I've used:

- `--joblog` writes out a detailed logfile of the jobs, which can be used to resume from interrupted runs with `--resume{,-failed}`

- `--slf filename` can be used to provide a list of ssh logins to remote worker nodes to run jobs. Importantly, parallel will automatically reread this list when it changes. This lets you very easily distribute batch jobs across preemptible gcloud vms (or ec2 spot instances) and gracefully handle worker nodes appearing/disappearing with just a few lines of bash: https://gist.github.com/gpittarelli/5e14fb772ce0230a3c40ffad2c2262be

- When used with bash, parallel can run bash functions if you export them with `export -f functionName`.
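
A minimal sketch of the `--joblog` / `--resume-failed` workflow from the first bullet above. The helper name `run_batch`, the toy job (`test {} != b`, which fails only for the input `b`), and the inputs are made up for illustration; the sketch assumes GNU parallel is on the PATH and skips itself otherwise.

```shell
# Hypothetical helper: run a toy batch and record every job in a job log.
# $1 is the job log path; any further arguments are passed to parallel.
run_batch() {
  joblog=$1
  shift
  # The toy job exits non-zero for input "b" and zero for "a" and "c".
  parallel "$@" --joblog "$joblog" 'test {} != b' ::: a b c
}

if command -v parallel >/dev/null 2>&1 &&
   parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  log_file=$(mktemp)
  # First run: parallel's exit status is non-zero because one job fails.
  run_batch "$log_file" || echo "first run: some jobs failed"
  # Second run: only the jobs the log marks as failed are retried.
  run_batch "$log_file" --resume-failed || echo "retry: still failing"
  rm -f "$log_file"
else
  echo "GNU parallel not found; skipping the sketch"
fi
```

In a real batch the retry would normally follow a fix to whatever made the job fail; here the toy job fails again by design, so both messages are printed.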
mruts about 6 years ago
I've never used GNU Parallel. But could someone explain to me the value add vs GNU xargs -P/--max-procs? From the examples at the top, it seems like those could be achieved with xargs.
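
For reference, a minimal sketch of the `xargs -P` style the question refers to. The helper name `process_items`, the toy job, and the inputs are hypothetical. The sketch sorts the result because xargs does not keep the output of concurrent jobs apart or in input order, whereas GNU parallel groups each job's output by default.

```shell
# Hypothetical helper: run a trivial per-item job with up to 4 workers.
process_items() {
  # -P 4: at most 4 processes at once; -n 1: one input item per invocation.
  # Each item is appended as an argument and arrives in the job as $1.
  xargs -P 4 -n 1 sh -c 'echo "processed $1"' _
}

# Completion order is not guaranteed, so sort for a stable result.
printf '%s\n' alpha beta gamma | process_items | sort
```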
bloopernova about 6 years ago
Parallel is Good Stuff (tm) and works very well, but I haven't had much cause to use it.

For ad-hoc system modifications I've found myself using tmux's synchronize-panes feature, or xargs. For anything bigger or more involved, I break out Ansible/Chef/Puppet depending on which client project I'm working on.

I remember one place I worked at had a huge elaborate configuration/deployment system hand written by the head IT guy which used Parallel+bash+perl extensively. Thing is, while it was a great system, I could make the same changes in Ansible or Puppet with a couple of lines and push them within minutes, while making changes using the hand written system might take hours. Plus no logging and poor error handling led to all sorts of problems with that system, despite it being a real labour of love by that wacky Finnish dude.

However, this sheet is really nice because it is just one side of a letter/A4 piece of paper and lays out the information clearly. I definitely want to mess around with Parallel now because of this cheat sheet. I wonder how it was typeset or laid out on the page? I try to write my own cheat sheets but they always seem way too sparse with too much white space. Maybe it is written in LaTeX or similar.
jason_slack about 6 years ago
I use GNU Parallel for pulling stock data from various sources, massaging it, creating flatfiles of the data, creating models of the data, etc.

I also use it as a rudimentary queue system for stacking up the next jobs (while scripts stack up the next jobs, but..).

It had a bit of a learning curve because the docs are really technical and not geared towards new users enough, but reading and re-reading and trying some examples helped cement it.

Here are a few ways I use it:

    echo "Number of RAR archives: "$(ls *.rar | wc -l)

    ls *.rar | parallel -j0 1_1_rarFilesExtraction

    ls -d stocks_all/Intraday/*.txt | parallel -j${ccj}% 1_2_stockFileProcessing {}

I'd like to scale this to work with multiple machines (as Parallel can do) but I get really tempted to just write my own parallel processor just to rely on my own code.
scrummyin about 6 years ago
My favorite parallel command: `$ find ~/Source/folder -name .git | parallel "cd {}/.. ; git pull ; git checkout -b new_branch"`
akramer about 6 years ago
Each time I've seen something about GNU parallel pop up I've been tempted to post, but I've never made an account until now.

I wrote a very different style of command parallelizer that I named lateral. It doesn't require constructing elaborate command lines that define all of your work at once. You start a server, and separate invocations of 'lateral run' add your commands to a queue to run on the server, including their file descriptors. It makes for easier parallelization of complex arguments.

Take a look if this sort of thing interests you, as I haven't seen anyone write one like this before. Its primary difference is the ease with which each separate command can output to its own log, and the lack of need to play games with shell quoting and positional arguments.

Check it out: https://github.com/akramer/lateral
res0nat0r about 6 years ago
Lots of good examples also here: https://www.gnu.org/software/parallel/man.html
Mizza about 6 years ago
If you're using GNU Parallel for simple, non-parallel command line tasks and scripting, I've written a tool which I find to be much more intuitive:

https://github.com/Miserlou/Loop

The author of GNU Parallel wrote a pretty detailed comparison, which you can find in the linked README.
devy about 6 years ago
Is there a Rust port of GNU parallel? It's written in Perl, and having to install dependencies for Perl is not as simple as downloading a binary :)
hprotagonist about 6 years ago
Still often the simplest way to get parallel computation in Python, sadly.