As another user mentioned, many POSIX and/or GNU utilities haven't aged well.
I respect trying to stay portable, but people's needs change over time and these
tools simply haven't kept up. Like the other user, I use fd now instead:<p><a href="https://github.com/sharkdp/fd" rel="nofollow">https://github.com/sharkdp/fd</a><p>As well as The Silver Searcher:<p><a href="https://github.com/ggreer/the_silver_searcher" rel="nofollow">https://github.com/ggreer/the_silver_searcher</a><p>While grep's performance is pretty good, it has also gotten pretty stale with regard
to its defaults and options.
I often set the nullglob option in scripts, because it makes the handling of globs which don't match anything a bit more predictable:<p><a href="http://bash.cumulonim.biz/NullGlob.html" rel="nofollow">http://bash.cumulonim.biz/NullGlob.html</a><p>There's a note at the end about how with nullglob set, ls on a glob with no matches does something surprising. This is a great illustration of how an empty list and the absence of a list are different. Sadly it's rather hard to make that distinction in shells!<p>I do wish that either shells had a more explicit syntax for globbing, or other commands didn't use the same syntax for patterns. Then confusion like this couldn't occur. An example of the former would be if you had to write:<p><pre><code> ls $(glob *.txt)
</code></pre>
Here, the shell would not treat * specially, but rather you would have to explicitly expand it. This would be a pain, but at least you wouldn't do it by mistake!
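<p>A minimal bash sketch of the nullglob difference, assuming a freshly created empty directory (count_args is a made-up helper name). Without nullglob the unmatched pattern is passed through literally; with it the pattern expands to nothing at all, which is why ls on such a glob turns into a bare ls that lists the whole current directory:

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it reports how many arguments it received.
count_args() { echo "$#"; }

dir=$(mktemp -d)   # empty directory, so *.txt matches nothing in it

shopt -u nullglob
count_default=$(count_args "$dir"/*.txt)    # the literal pattern is passed: 1 argument

shopt -s nullglob
count_nullglob=$(count_args "$dir"/*.txt)   # the pattern disappears: 0 arguments

shopt -u nullglob
rmdir "$dir"
echo "default: $count_default argument(s), nullglob: $count_nullglob argument(s)"
```

<p>The command cannot tell "no arguments were given" apart from "the glob matched nothing", which is the empty list versus absent list problem described above.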
Tangential: another safeguard you can adopt is avoiding "hard delete" commands like rm and find -delete. Untrain yourself from these commands by never using them. On Mac systems, the "trash" program (brew install trash) sends files to your system trash. You can use `trash [file]` and `find .. -print0 | xargs -0 trash --`. rm is a dangerous command you should only very rarely be using.<p>I fish something out of the trash a few times a year and lemme tell ya; it's worth the investment.<p>Another tip if you fancy debugging shells is using<p><pre><code> python -c "print(__import__('sys').argv[1:])" sample "arg here" * foo
</code></pre>
This provides the same functionality as the C program in TFA, without needing gcc.
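<p>If Python isn't at hand either, a plain shell function can do roughly the same job. This is only a sketch, and show_args is a name made up here; it prints every argument on its own line, bracketed so that embedded spaces stay visible:

```shell
# show_args is a hypothetical helper name: print each argument on its own
# line, wrapped in <> so embedded spaces and empty strings are visible.
show_args() {
    printf '<%s>\n' "$@"
}

# The star is quoted here so the output is predictable; leave it unquoted
# to see what the shell expands it to in the current directory.
show_args sample "arg here" '*' foo
```

<p>One caveat: called with no arguments at all, printf still prints a single empty pair of brackets.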
For a long time (probably since the first time I forgot to quote something and got burned, so around 40 years), I've thought that there should be some mechanism for the shell to pass in information about how each argument came about.<p>For each argument, it would tell the program if it was supplied directly, or came from wildcard expansion. For those from wildcard expansion, it would tell the program what the wildcard was.<p>Most programs would not care, but some programs could use this to catch common quoting errors.
Shell (because this is technically a shell issue, not a find issue) is the worst language that everyone should learn. It's a language you'll actually encounter, and it's one that's hard to avoid (unlike PHP).
This was obvious to me, but one version of this that surprised me is when using scp. If you glob a remote path like "scp myserver:*.jpg ./", it will probably work! But how? Because the remote path will likely not match any local files, so the path with the asterisk is passed to scp unchanged, and scp does the globbing on the remote side.
I believe that programming languages should never make the meaning of a program depend on the context in which it is executed. So many obscure bugs are directly caused by such behaviors. There should be exactly one possible interpretation for a given statement, and if that cannot be executed, then the program should abort. In this case, the glob should never have been passed on to find. It should either have expanded to the empty array, or failed.
My instant reaction to the example was “that won’t work; your shell will say something like ‘no matches’”.<p>Using an unescaped star in a find command <i>never</i> works for me, which is a lot better than it sometimes working and sometimes breaking!<p>Reading the article and the comments, it seems like bash doesn’t do this? I suppose it’s one nice thing about oh-my-zsh, whose default config I use almost unchanged.
Putting ‘shellcheck’ in your CI pipeline is a must for me now, after one too many mistakes.<p>I just finished cleaning away all existing ‘error’ and ‘warning’ level issues in our codebase so that the ‘shellcheck’ CI step can be really strict on code quality.
Subject of the post is wrong. This is a common mistake between the user and whichever shell the user is using, not the user and the command itself.<p>The find command works exactly as expected.
Phew! I'm glad I've been hitting the "Happy Case" scenario all these years!<p>Very useful article. And very informative.<p>Summary:<p>Instead of -<p><pre><code> find . -name *.jpg
</code></pre>
Use quotes around the pattern, i.e.<p><pre><code> find . -name '*.jpg'
</code></pre>
Edit: Oops, the double-quotes should have been single quotes! Thanks, @lucd. Happy case, like I said!
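<p>A small sketch of why the unquoted form only works in the "happy case". The directory layout and file names below are invented for the demonstration; the point is that with exactly one match in the current directory the shell rewrites the pattern into that file name, so find silently misses files further down:

```shell
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/a.jpg" "$demo/sub/b.jpg"

# Unquoted: the shell expands *.jpg to a.jpg first, so find runs as
# "find . -name a.jpg" and never sees sub/b.jpg.
unquoted=$(cd "$demo" && find . -name *.jpg | sort)

# Quoted: find receives the pattern itself and matches at every depth.
quoted=$(cd "$demo" && find . -name '*.jpg' | sort)

printf 'unquoted:\n%s\nquoted:\n%s\n' "$unquoted" "$quoted"
rm -rf "$demo"
```

<p>With two or more .jpg files in the current directory, the unquoted version hands find extra arguments and it fails with an error instead, which is at least noticeable.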
Common? Yes. Simple enough to stop making this mistake after two times? Also yes. Once you internalize in which cases the shell is responsible for globbing and in which the command itself is, it is pretty clear-cut.
Isn't this in the UNIX-HATERS Handbook? I never use -name without ''. I guess this is just muscle memory from early on, when I ran into this issue: in Unix, *.py can mean very different things depending on where it gets resolved.
The title seems a bit off since shell expansions and arguments have nothing to do with the find command.<p>Both features are also often covered in entry-level material for introduction to shell.
It is just awesome that I stumbled upon this post. I remember I previously faced a similar issue while running a command like<p><pre><code> find . -name *.gradle | blah blah
</code></pre>
Instead of finding the root cause, I bypassed it with<p><pre><code> find . | grep "\.gradle" | blah blah
</code></pre>
It just feels great to now connect the dots and know the real reason for the issue.
> You can type: man glob<p>Translation: you cannot type "man" followed by an asterisk because that would have required forethought in how one learns a programming language.<p>Argle: Hey Bargle, is that new bridge built to spec?<p>Bargle: It's like I always say, man: <i>good enough for shell script manual operator discoverability</i>.<p>Argle: Yeah, you're always saying that...
I almost appreciate the idea of writing a C program whose sole purpose is to show you the arguments sent to it. That's some serious overkill.<p>But I think echo *.py would not just be easier, but more effective at demonstrating what your find command line will actually look like after shell expansion.
Another one I've seen a few times with find (although it is actually more an xargs thing than a find thing), usually not with any bad consequences at least, is something like this:<p>find . -type f | xargs grep foo<p>You expect to see all the "foo" lines from your files, each prefixed with the file name and a colon. And that's what you get most of the time.<p>But sometimes you might get a foo line without the filename prefix.<p>Why? Because grep only adds the filename prefix when there is more than one filename argument. There are two ways that might come about in the above command.<p>The first is if the find only finds one file.<p>The second is if the find finds so many files that xargs has to invoke grep more than once. It can happen that there is only one file left to do when it gets to the final grep invocation.<p>Simple fix:<p>find . -type f | xargs grep foo /dev/null
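<p>The prefix behaviour is easy to reproduce without xargs. A sketch using a throwaway file (the name only.txt is invented here): a single file argument gets no prefix, and adding /dev/null as a second argument brings the prefix back.

```shell
tmp=$(mktemp -d)
printf 'foo line\n' > "$tmp/only.txt"

# One file argument: grep prints the matching line without a file name.
without_prefix=$(grep foo "$tmp/only.txt")

# Two file arguments (the second can never match): the prefix appears.
with_prefix=$(grep foo "$tmp/only.txt" /dev/null)

printf '%s\n%s\n' "$without_prefix" "$with_prefix"
rm -rf "$tmp"
```

<p>GNU and BSD grep also accept -H to force the file name, but that option is an extension rather than something every grep is guaranteed to have, so the /dev/null trick is the more portable one.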
Do people really use find like that to clean up source code?<p><pre><code> git clean -fxdn # review what would be deleted, then
git clean -fxd</code></pre>
Globbing is also more complex than it seems at first blush.<p>Alternation is supported in bash. Stuff like <i>"echo ∗.{png,jp{e,}g}"</i> (utf8 asterisk to get around HN, cut/paste won't work).<p>The BSD-derived glob is handy for some occasionally useful tricks. Like in Perl:<p>use File::Glob qw/bsd_glob/;<p>my @list = bsd_glob('This {list|stuff} is nested {{very,quite} deeply,deep}');
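<p>To see what the nested braces turn into, here is a bash-only sketch that uses a fixed prefix (x, chosen arbitrarily) instead of a star, so the result doesn't depend on which files exist. Bash performs brace expansion before pathname expansion, so with a star each resulting word would then be globbed separately:

```shell
#!/usr/bin/env bash
# Brace expansion is purely textual, so it works even when no such files exist.
# The inner {e,} yields "jpeg" and "jpg"; the outer braces add "png" in front.
expanded=$(echo x.{png,jp{e,}g})
echo "$expanded"
```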
Apparently, the OP read the man page of "<i>glob</i>" but couldn't spend a single minute reading the man page of <i>FIND</i>.<p><pre><code> -name pattern
... Don't forget to enclose the pattern in quotes in order to protect it from expansion by the shell.
</code></pre>
So, a lengthy article on an already documented feature of <i>find</i>.
One of my favorite features of fish shell is the ** wildcard expansion:<p><a href="https://fishshell.com/docs/current/tutorial.html#tut_wildcards" rel="nofollow">https://fishshell.com/docs/current/tutorial.html#tut_wildcar...</a><p>It's rare I reach for find nowadays since switching.
Also, a good thing to remember is to mind the order of your options.
I know from experience that it matters a lot after thinking once: "I can just place this `-delete` option wherever, right?" and using it as my first option.
Needless to say, I had a very bad time.
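<p>The reason is that find evaluates its expression left to right, with an implicit AND between the parts, so an action placed first runs before any test gets a chance to filter. A sketch that uses -print as a harmless stand-in for -delete, in a scratch directory with invented file names:

```shell
work=$(mktemp -d)
touch "$work/keep.txt" "$work/junk.tmp"

# Action first: -print runs for every entry (the directory and both files)
# before -name is even tested.
action_first=$(find "$work" -print -name '*.tmp' | wc -l | tr -d ' ')

# Test first: -print only runs for entries that already passed -name.
test_first=$(find "$work" -name '*.tmp' -print | wc -l | tr -d ' ')

echo "action first: $action_first entries, test first: $test_first entry"
rm -rf "$work"
```

<p>Swap -print for -delete in the first form and every one of those entries would be removed, which is presumably the very bad time described above.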
It is unexpected for new users; the principle of least surprise suggests it's not a great idea. At the same time, globbing is a basic concept for beginning shell users. I don't remember how I learned about it, or whether someone told me.
TL;DR: globbing kicks in before other things unless you turn it off like with single quotes:<p><a href="https://www.tldp.org/LDP/abs/html/globbingref.html" rel="nofollow">https://www.tldp.org/LDP/abs/html/globbingref.html</a><p>Don't forget about globbing when making Bash scripts.
At my workplace I use Git Bash on Windows, and because I'm always getting
man: command not found
I flick to a browser and type
man <whatever I was looking for><p>About once a year I forget what happened all the previous times I typed<p><pre><code> man find
</code></pre>
into google.<p>TLDR: looking for Dennis Ritchie, I found Chuck Tingle.
<p><pre><code> find . -name *.jpg
</code></pre>
TL;DR: if you don’t wrap your wildcard expressions in quotes, the shell will expand them.<p><pre><code> find . -name '*.jpg'
</code></pre>
So wrap them in single quotes so that they make their way to the find command unexpanded.
This is the <i>one</i> thing I like about DOS/cmd.exe. In those shells, wildcard expansion is done by applications instead of the shell itself, so there's no need to resort to hacks like this.