Cheating at a company group activity using Unix tools

139 points by devenvdev over 3 years ago

11 comments

l0b0 over 3 years ago

    ls | grep '.csv$' | xargs cat | grep 'cake' | cut -d, -f2,3 > cakes.csv

That's quite a few antipatterns in one go. Unless you have a bajillion files, the `xargs` is unnecessary, the `cat` and `ls` are unnecessary (and `ls` in shell scripts is a whole class of antipatterns by itself). You might want to use something like this instead:

    grep cake *.csv | cut -d, -f2,3 > cakes.csv
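
A minimal sketch of the shorter pipeline, run against throwaway data. The file names and the `id,item,flavour` column layout are assumptions made up for this illustration, not the article's real files. With more than one input file, grep prefixes each match with `file:`, but that prefix lands in field 1, so `cut -f2,3` is unaffected. The sketch also writes the result outside the input directory, because on a repeat run in the same directory an existing `cakes.csv` would itself match `*.csv`.

```shell
#!/bin/sh
# Sketch only: sample files and column layout are hypothetical.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/in" "$workdir/out"

# Two tiny input files (assumed layout: id,item,flavour).
printf '1,cake,chocolate\n2,pie,apple\n' > "$workdir/in/a.csv"
printf '3,cake,lemon\n' > "$workdir/in/b.csv"

# grep reads the files named by the glob directly: no ls, cat or xargs.
# The output goes to a separate directory so it never matches *.csv.
(cd "$workdir/in" && grep cake *.csv | cut -d, -f2,3 > ../out/cakes.csv)

result=$(cat "$workdir/out/cakes.csv")
printf '%s\n' "$result"
```

If field 1 were needed as well, GNU and BSD grep both accept `-h` to suppress the file name prefix.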

mattrighetti over 3 years ago
For those interested in this topic, I would suggest these incredible lectures by MIT [0], especially the data wrangling one.

The lectures are hosted on YouTube. They are extremely valuable and easy to follow, and they give pretty good insight into a lot of Unix topics.

[0]: https://missing.csail.mit.edu/2020/

tzs over 3 years ago
The 'comm' command should be in there. With no options, 'comm' takes two files, F1 and F2, which should be lexically sorted, and produces 3 columns of output.

The first column consists of lines that are only in F1, the second column consists of lines that are only in F2, and the third column consists of lines that are common to both files.

The option -1 tells it not to print column 1, -2 tells it not to print column 2, and -3 does the same for column 3. These can be combined, so -12 would only print column 3 (the lines that are in both files) and -13 would only print column 2 (the lines that are in F2 but not F1).
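
The column options described above can be tried on two small sorted files. This is a sketch with invented names as input; `-23` is not mentioned in the comment but follows from the same combining rule.

```shell
#!/bin/sh
# Sketch only: the two input lists are hypothetical sample data.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# comm expects both inputs to be sorted already.
printf 'alice\nbob\ncarol\n' > "$workdir/f1"
printf 'bob\ncarol\ndave\n' > "$workdir/f2"

both=$(comm -12 "$workdir/f1" "$workdir/f2")     # column 3: lines in both files
only_f2=$(comm -13 "$workdir/f1" "$workdir/f2")  # column 2: lines only in F2
only_f1=$(comm -23 "$workdir/f1" "$workdir/f2")  # column 1: lines only in F1

printf 'both:\n%s\nonly in F2:\n%s\nonly in F1:\n%s\n' "$both" "$only_f2" "$only_f1"
```

For unsorted input, each file would need a pass through `sort` first.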

pkrumins over 3 years ago
The first example is super super bad here. Never pipe `ls`. When you feel like you need to pipe `ls`, then you know you want to use `find`.
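
One way the `find` version might look, as a sketch under assumptions: the sample files and their `id,item,flavour` layout are invented here. The file name with spaces is the case where `ls | ... | xargs cat` goes wrong, since xargs splits its input on whitespace by default.

```shell
#!/bin/sh
# Sketch only: sample files and column layout are hypothetical.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

printf '1,cake,chocolate\n' > "$workdir/plain.csv"
printf '2,cake,lemon\n' > "$workdir/name with spaces.csv"
printf 'cake notes\n' > "$workdir/notes.txt"   # not a .csv, must be skipped

# find passes the matching paths to grep as intact arguments, so nothing
# parses ls output and the name containing spaces is not split.
# -h (GNU and BSD grep) drops the "path:" prefix; sort makes the order
# stable, because find does not promise any particular order.
matches=$(find "$workdir" -type f -name '*.csv' -exec grep -h cake {} + \
  | cut -d, -f2,3 | sort)

printf '%s\n' "$matches"
```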

dsr_ over 3 years ago
"After some digging, it was easy to find the HTTP request that pulled this information from the server. And it even had all the birthdates in the JSON!"

HR needs to know this, but it shouldn't be available to random employees.

perryizgr8 over 3 years ago
> regex is so ubiquitous and valuable that if you don’t know it yet, you should learn it)

Regex is one of those things I have to learn every single time I need to use it. I just can't seem to force myself to remember.

amtamt over 3 years ago
An extension of a classic problem from "Programming Pearls" by Jon Bentley. Nice to see such pragmatism for one-time problems.

smitty1e over 3 years ago
For doing work with JSON data, I'd add:

https://stedolan.github.io/jq/

unixbane over 3 years ago

    jq ... | sed -E 's/([0-9][0-9]).([0-9][0-9]).[0-9]*$/\2_\1/'

this fails for me since the jq output lines are surrounded by quotes. had to remove $. did i do something different or are we running different jq versions?
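
The behaviour described above can be reproduced with sed alone. This is a sketch, and the `DD.MM.YYYY` sample value is an assumption, since the real field format is not shown in the thread. With the surrounding quotes in place, the trailing `"` sits between the digits and the end of the line, so the `$`-anchored pattern never matches. jq prints strings as quoted JSON unless it is given `-r` (`--raw-output`), which may explain the difference. Stripping the quotes with `tr` is another way around it.

```shell
#!/bin/sh
# Sketch only: the sample date value is hypothetical.
set -eu

# The expression from the comment, with and without the trailing `$`.
anchored='s/([0-9][0-9]).([0-9][0-9]).[0-9]*$/\2_\1/'
unanchored='s/([0-9][0-9]).([0-9][0-9]).[0-9]*/\2_\1/'

raw='25.12.1990'       # what jq -r would print for a string value
quoted='"25.12.1990"'  # what jq prints without -r

raw_out=$(printf '%s\n' "$raw" | sed -E "$anchored")
quoted_anchored=$(printf '%s\n' "$quoted" | sed -E "$anchored")
quoted_unanchored=$(printf '%s\n' "$quoted" | sed -E "$unanchored")
stripped=$(printf '%s\n' "$quoted" | tr -d '"' | sed -E "$anchored")

printf '%s\n' "$raw_out" "$quoted_anchored" "$quoted_unanchored" "$stripped"
```

Only the second result is left unchanged; dropping the `$` rewrites the date but keeps the quotes around it.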

unixbane over 3 years ago
> Was it worth it?

> 1 minute to do this

> 1 minute to do that

and 1 minute to introduce RCE vulns into company #589179283672's pipeline due to the "you don't understand the security implications of using fragile UN*X tools" problem which applies to anyone actually learning something from this article DAY OF THE SEAL SOON,

ccalloway over 3 years ago
Most of the justifications for using collections of command-line Unix tools are no longer valid today. Instead you should be using a proper programming language.

Note that people who still do use complex solutions built from cat, head, cut, etc., and who know what they're doing, will typically either write a shell script (which won't be structured particularly differently from the equivalent Python or whatever) or rely heavily on awk (itself a full-featured programming language, no easier to learn than any other scripting language), or both.

One-liners which pipe text between four or five different commands are the equivalent of hand-soldered boards or bitwise arithmetic: interesting to learn about for historical reasons but of no practical utility.

The use of things like xargs and jq in this solution, difficult-to-invoke Unix utilities for doing things that are trivial in any reasonable language, makes this even more clear.