A common mistake involving wildcards and the find command

237 points by robertelder, over 5 years ago

38 comments

svnpenn, over 5 years ago
As another user mentioned, many POSIX and/or GNU utilities haven't aged well. I respect trying to stay portable, but people's needs change over time and these tools simply haven't kept up. Like the other user, I use fd now instead:

https://github.com/sharkdp/fd

As well as The Silver Searcher:

https://github.com/ggreer/the_silver_searcher

While grep's performance is pretty good, it has also gotten pretty stale with regard to its defaults and options.
twic, over 5 years ago
I often set the nullglob option in scripts, because it makes the handling of globs which don't match anything a bit more predictable:

http://bash.cumulonim.biz/NullGlob.html

There's a note at the end about how, with nullglob set, ls on a glob with no matches does something surprising. This is a great illustration of how an empty list and the absence of a list are different. Sadly, it's rather hard to make that distinction in shells!

I do wish that either shells had a more explicit syntax for globbing, or other commands didn't use the same syntax for patterns. Then confusion like this couldn't occur. An example of the former would be if you had to write:

    ls $(glob *.txt)

Here, the shell would not treat * specially; rather, you would have to expand it explicitly. This would be a pain, but at least you wouldn't do it by mistake!
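A minimal sketch of the difference nullglob makes, assuming bash is installed (nullglob is a bash option, not part of POSIX sh). The helper name `count_txt_args` is made up for illustration; it counts how many arguments the pattern `*.txt` turns into.

```shell
# Count how many arguments the pattern *.txt becomes inside directory $1.
# $2 is "on" to enable nullglob; anything else keeps bash's default behaviour.
# count_txt_args is a hypothetical helper, not part of any tool.
count_txt_args() {
    bash -c '
        cd "$1" || exit 1
        if [ "$2" = on ]; then shopt -s nullglob; fi
        set -- *.txt
        echo "$#"
    ' _ "$1" "$2"
}

empty_dir=$(mktemp -d)
count_txt_args "$empty_dir" off   # 1: the unmatched pattern stays as one literal argument
count_txt_args "$empty_dir" on    # 0: the pattern expands to nothing at all
rmdir "$empty_dir"
```

The zero-argument case is also why `ls *.txt` can surprise under nullglob: with no matches the command line collapses to a bare `ls`, which lists the whole directory.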
nothrabannosir, over 5 years ago
Tangential: another safeguard you can adopt is avoiding "hard delete" commands like rm and find -delete. Untrain yourself from these commands by never using them. On Mac systems, the "trash" program (brew install trash) sends files to your system trash. You can use `trash [file]` and `find .. -print0 | xargs -0 trash --`. rm is a dangerous command that you should only very rarely be using.

I fish something out of the trash a few times a year, and lemme tell ya: it's worth the investment.

Another tip, if you fancy debugging shells, is using

    python -c "print(__import__('sys').argv[1:])" sample "arg here" * foo

This provides the same functionality as the C program in TFA, without needing gcc.
tzs, over 5 years ago
For a long time (probably since the first time I forgot to quote something and got burned, so around 40 years), I've thought that there should be some mechanism for the shell to pass in information about how each argument came about.

For each argument, it would tell the program whether it was supplied directly or came from wildcard expansion. For those from wildcard expansion, it would tell the program what the wildcard was.

Most programs would not care, but some programs could use this to catch common quoting errors.
dehrmann, over 5 years ago
Shell (because this is technically a shell issue, not a find issue) is the worst language that everyone should learn. It's a language you'll actually encounter, and it's one that's hard to avoid (unlike PHP).
seiferteric, over 5 years ago
This was obvious to me, but one version of this that surprised me is when using scp. If you glob a remote path, as in "scp myserver:*.jpg ./", it will probably work! But how? Because the remote path will likely not match any local files, the path with the asterisk is passed to scp unchanged, and scp does the globbing on the remote side.
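A small sketch of why the scp case usually works, assuming a POSIX sh or bash with default options (zsh, as configured by default, would instead refuse the unmatched pattern). No scp command is run; `describe_args` is a made-up helper and `myserver` is just the host name from the comment.

```shell
# Report how many arguments a pattern became, and the first one.
# describe_args is a hypothetical helper for illustration only.
describe_args() {
    echo "$# argument(s), first: $1"
}

work_dir=$(mktemp -d)
(
    cd "$work_dir" || exit 1
    # No local file matches, so the pattern is passed through untouched.
    describe_args myserver:*.jpg
    # A local file whose name happens to match changes what the command receives.
    touch 'myserver:cat.jpg'
    describe_args myserver:*.jpg
)
rm -r "$work_dir"
```

The second call shows the fragile part: the command only works as long as nothing in the current directory matches. Quoting the remote path removes that dependence.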
ulrikrasmussen, over 5 years ago
I believe that programming languages should never make the meaning of a program depend on the context in which it is executed. So many obscure bugs are directly caused by such behaviors. There should be exactly one possible interpretation for a given statement, and if that cannot be executed, then the program should abort. In this case, the glob should never have been passed on to find. It should either have expanded to the empty array, or failed.
umanwizard, over 5 years ago
My instant reaction to the example was “that won’t work; your shell will say something like ‘no matches’”.

Using an unescaped star in a find command *never* works for me, which is a lot better than it sometimes working and sometimes breaking!

Reading the article and the comments, it seems like bash doesn’t do this? I suppose it’s one nice thing about oh-my-zsh, whose default config I use almost unchanged.
mehrdada, over 5 years ago
Straight from *The UNIX-HATERS Handbook*. https://web.mit.edu/~simsong/www/ugh.pdf
thundergolfer, over 5 years ago
Putting ‘shellcheck’ in your CI pipeline is a must for me now, after one too many mistakes.

I just finished cleaning away all existing ‘error’ and ‘warning’ level issues in our codebase so that the ‘shellcheck’ CI step can be really strict on code quality.
arcade79, over 5 years ago
The subject of the post is wrong. This is a common mistake between the user and whichever shell the user is using, not between the user and the command itself.

The find command works exactly as expected.
thunderbong, over 5 years ago
Phew! I'm glad I've been hitting the "Happy Case" scenario all these years!

Very useful article. And very informative.

Summary:

Instead of

    find . -name *.jpg

use quotes around the pattern, i.e.

    find . -name '*.jpg'

Edit: Oops, the double-quotes should have been single quotes! Thanks, @lucd. Happy case, like I said!
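To see the two forms behave differently, here is a minimal sketch run against a throwaway directory. It assumes a POSIX sh or bash with default options; the helper names `find_unquoted` and `find_quoted` are illustrative only.

```shell
# Both helpers run inside directory $1. The names are made up for this demo.
find_unquoted() ( cd "$1" && find . -name *.jpg | sort )
find_quoted()   ( cd "$1" && find . -name '*.jpg' | sort )

# One .jpg in the top directory, one in a subdirectory.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/top.jpg" "$demo_dir/sub/nested.jpg"

# Unquoted: the shell rewrites the command to `find . -name top.jpg` before
# find runs, so ./sub/nested.jpg is silently missed.
find_unquoted "$demo_dir"
echo '---'
# Quoted: find receives the pattern itself and matches at every depth.
find_quoted "$demo_dir"

rm -r "$demo_dir"
```

With exactly one match in the current directory the unquoted form quietly returns too little; with two or more matches find receives extra arguments it cannot parse and typically reports an error; with none, the pattern reaches find intact, which is the "happy case".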
mynegation, over 5 years ago
Common? Yes. Simple enough to stop making this mistake after two times? Also yes. Once you internalize in which cases the shell is responsible for globbing and in which cases the command itself is, it is pretty clear-cut.
vikinghckr, over 5 years ago
`find` has one of the worst user experiences out of UNIX tools. I prefer to use `find . | grep foo` to find files.
StreamBright, over 5 years ago
Isn't this in the Unix hater manual? I never use -name without ''. I guess this is just muscle memory from early on, when I ran into this issue: in Unix, *.py can mean very different things depending on where it gets resolved.
madsbuch, over 5 years ago
That was a lot of text to explain that one should be cautious of the wildcard expansion some shells provide.

Thanks! I would have jumped right in!
joana035, over 5 years ago
The title seems a bit off, since shell expansions and arguments have nothing to do with the find command.

Both features are also often covered in entry-level material for introductions to the shell.
siddharthgoel88, over 5 years ago
It is just awesome that I stumbled upon this post. I remember previously facing a similar issue while running a command like

    find . -name *.gradle | blah blah

Instead of finding the root cause, I bypassed it with

    find . | grep "\.gradle" | blah blah

It just feels great to now connect the dots and know the real reason for the issue.
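One caveat with the grep workaround, sketched below under the assumption that the goal is files whose names end in `.gradle`: the unanchored pattern also matches paths that merely pass through a `.gradle` directory. The sample paths and the helper name `only_gradle_files` are invented for illustration.

```shell
# Keep only paths (read from stdin) that end in .gradle.
# The trailing $ anchors the match to the end of each path.
only_gradle_files() {
    grep '\.gradle$'
}

# Unanchored: all three paths match, including the cache file inside ./.gradle/.
printf '%s\n' ./build.gradle ./.gradle/cache.bin ./app/settings.gradle | grep '\.gradle'
echo '---'
# Anchored: only the two files that actually end in .gradle remain.
printf '%s\n' ./build.gradle ./.gradle/cache.bin ./app/settings.gradle | only_gradle_files
```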
sys_64738, over 5 years ago
    find . -name \*.jpg

This is pretty elementary, something any seasoned Linux person should know.
jancsika, over 5 years ago
> You can type: man glob

Translation: you cannot type "man" followed by an asterisk because that would have required forethought in how one learns a programming language.

Argle: Hey Bargle, is that new bridge built to spec?

Bargle: It's like I always say, man: *good enough for shell script manual operator discoverability*.

Argle: Yeah, you're always saying that...
rootusrootus, over 5 years ago
I almost appreciate the idea of writing a C program whose sole purpose is to show you the arguments sent to it. That's some serious overkill.

But I think echo *.py would not just be easier, but more effective at demonstrating what your find command line will actually look like after shell expansion.
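A possible middle ground between the C program and `echo`, sketched here as a shell function: `echo` joins its arguments with spaces, so it cannot show where one argument ends and the next begins. The name `print_args` is hypothetical, as are the sample file names.

```shell
# Print every argument on its own line, wrapped in <> so embedded spaces are visible.
# print_args is a made-up stand-in for the article's C program.
print_args() {
    if [ "$#" -gt 0 ]; then
        printf '<%s>\n' "$@"
    fi
}

scratch=$(mktemp -d)
(
    cd "$scratch" || exit 1
    touch 'one.py' 'two words.py'
    echo *.py          # one.py two words.py   (two arguments or three?)
    print_args *.py    # <one.py> and <two words.py>, one per line
)
rm -r "$scratch"
```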
tzs, over 5 years ago
Another one I've seen a few times with find (although it is actually more an xargs thing than a find thing), usually without any bad consequences at least, is something like this:

    find . -type f | xargs grep foo

You expect to see all the "foo" lines from your files, each prefixed with the file name and a colon. And that's what you get most of the time.

But sometimes you might get a foo line without the filename prefix.

Why? Because grep only adds the filename prefix when there is more than one filename argument. There are two ways that might come about in the above command.

The first is if the find only finds one file.

The second is if the find finds so many files that xargs has to invoke grep more than once. It can happen that there is only one file left to do when it gets to the final grep invocation.

Simple fix:

    find . -type f | xargs grep foo /dev/null
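The prefix behaviour is easy to reproduce without find or xargs. This sketch uses a temporary file created with mktemp, so the file name that appears in the output will differ from run to run.

```shell
sample=$(mktemp)
echo 'a foo line' > "$sample"

# One filename argument: grep prints the matching line without a filename prefix.
grep foo "$sample"
# Adding /dev/null makes it two filenames, so the prefix always appears.
# /dev/null is empty and can never contribute a match of its own.
grep foo "$sample" /dev/null

rm "$sample"
```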
JosefAssad, over 5 years ago
Do people really use find like that to clean up source code?

    git clean -fxdn # review what would be deleted, then
    git clean -fxd
tyingq, over 5 years ago
Globbing is also more complex than it seems at first blush.

Alternation is supported in bash. Stuff like *"echo ∗.{png,jp{e,}g}"* (utf8 asterisk to get around HN, cut/paste won't work).

That BSD-derived glob is useful for some occasionally useful tricks. Like in Perl:

    use File::Glob qw/bsd_glob/;

    my @list = bsd_glob('This {list|stuff} is nested {{very,quite} deeply,deep}');
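A short sketch of how the nested alternation unfolds, assuming bash is installed. Brace expansion is a bash feature rather than POSIX sh, so the snippets are run under `bash -c` explicitly. The file names are invented for the demo.

```shell
# Brace expansion happens before globbing and never looks at the filesystem.
bash -c 'echo photo.{png,jp{e,}g}'    # photo.png photo.jpeg photo.jpg

pics=$(mktemp -d)
touch "$pics/a.png" "$pics/b.jpg"
# With a wildcard, the braces first produce three patterns: *.png *.jpeg *.jpg.
# Each is then globbed on its own; *.jpeg matches nothing and stays literal.
(cd "$pics" && bash -c 'echo *.{png,jp{e,}g}')    # a.png *.jpeg b.jpg
rm -r "$pics"
```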
tigrezno, over 5 years ago
Apparently, the OP read the man page of "*glob*" but couldn't spend a single minute reading the man page of *FIND*.

    -name pattern
        ...
        Don't forget to enclose the pattern in quotes in order to protect it from expansion by the shell.

So, a lengthy article on an already documented feature of *find*.
alleycat5000, over 5 years ago
One of my favorite features of fish shell is the ** wildcard expansion:

https://fishshell.com/docs/current/tutorial.html#tut_wildcards

It's rare I reach for find nowadays since switching.
7oi, over 5 years ago
Also, a good thing to remember is to mind the order of your options. I know from experience that it matters a lot after thinking once: "I can just place this `-delete` option wherever, right?" and using it as my first option. Needless to say, I had a very bad time.
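The ordering matters because find evaluates its expression from left to right, and an action placed before a test fires before the test is consulted. The sketch below substitutes `-print` for `-delete` so that it is harmless to run; the helper names are made up for illustration.

```shell
# list_matches_first: the test comes before the action, so only matches are acted on.
# list_action_first:  the action comes first, so it fires for every path find visits.
# -print stands in for -delete to keep the demo harmless.
list_matches_first() ( cd "$1" && find . -name '*.tmp' -print | sort )
list_action_first()  ( cd "$1" && find . -print -name '*.tmp' | sort )

tree=$(mktemp -d)
touch "$tree/keep.txt" "$tree/junk.tmp"
list_matches_first "$tree"   # only ./junk.tmp
echo '---'
list_action_first "$tree"    # ., ./junk.tmp and ./keep.txt
rm -r "$tree"
```

Swapping `-print` back to `-delete` in the second form would act on everything find visits, which matches the bad time described above.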
NotSammyHagar, over 5 years ago
It is unexpected to new users; the principle of least surprise suggests it's not a great idea. At the same time, globbing is a basic concept for beginning shell users. I don't know how I learned about it, whether someone told me or something.
mangix, over 5 years ago
I wonder if this is also a problem with fish. The globbing there is different.
fouc, over 5 years ago
I always write it like this:

    find . -name \*.jpg

escape the glob
dkarras, over 5 years ago
I also have the habit of always using -iname instead of -name to make case insensitive searches.
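A quick sketch of the difference, assuming a find that supports `-iname` (GNU and BSD find both do). The helper names and file names are illustrative only.

```shell
# Count the paths under directory $1 that match *.jpg, case-sensitively and not.
count_name()  ( cd "$1" && find . -name  '*.jpg' | wc -l | tr -d ' ' )
count_iname() ( cd "$1" && find . -iname '*.jpg' | wc -l | tr -d ' ' )

photos=$(mktemp -d)
touch "$photos/beach.jpg" "$photos/SUNSET.JPG"
count_name "$photos"    # 1: SUNSET.JPG is skipped
count_iname "$photos"   # 2: both files match
rm -r "$photos"
```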
enriquto, over 5 years ago
Find is an ugly beast. I only use it to print all the files and then grep my way on.
emmelaich, over 5 years ago
Quiz: in any directory, full or empty, I can run this command

    $ *

and get

    *

What allows me to do this? (The $ is the prompt, not what I typed.)
shmerl, over 5 years ago
TL;DR: globbing kicks in before other things unless you turn it off, for example with single quotes:

https://www.tldp.org/LDP/abs/html/globbingref.html

Don't forget about globbing when writing Bash scripts.
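Single quotes are one of three common ways to keep a pattern away from the shell's globbing. The sketch below compares them; `what_find_sees` is a made-up helper that stands in for find and reports what it was handed.

```shell
# Report the single argument received, or how many there were if not exactly one.
# what_find_sees is a hypothetical name used only for this demo.
what_find_sees() {
    if [ "$#" -eq 1 ]; then echo "$1"; else echo "$# arguments"; fi
}

imgs=$(mktemp -d)
(
    cd "$imgs" || exit 1
    touch a.jpg b.jpg
    what_find_sees *.jpg      # 2 arguments (the shell expanded the pattern)
    what_find_sees '*.jpg'    # *.jpg
    what_find_sees "*.jpg"    # *.jpg
    what_find_sees \*.jpg     # *.jpg
)
rm -r "$imgs"
```

For a plain `*.jpg` all three quoted forms are equivalent. Single and double quotes only differ once the pattern contains something like `$` or a backtick, which double quotes still expand.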
vmchale, over 5 years ago
I use fd-find and I don't really miss find.
monkeycantype, over 5 years ago
At my workplace I use Git Bash on Windows, and because I'm always getting "man: command not found" I flick to a browser and type man <whatever I was looking for>.

About once a year I forget what happened all the previous times I typed

    man find

into Google.

TL;DR: looking for Dennis Ritchie, I found Chuck Tingle.
Lio, over 5 years ago
    find . -name *.jpg

TL;DR: if you don’t wrap your wildcard expressions in quotes, the shell will expand them.

    find . -name '*.jpg'

So wrap them in single quotes so that they make their way to the find command unexpanded.
jasonhansel, over 5 years ago
This is the *one* thing I like about DOS/cmd.exe. In those shells, wildcard expansion is done by applications instead of the shell itself, so there's no need to resort to hacks like this.