Don’t underestimate grep-based code scanning

166 points by perch56, almost 6 years ago

14 comments

secure, almost 6 years ago
One thing which was not immediately obvious to me for a while: the stricter your language's formatting is, the easier it will be to grep source code.

I work a lot with Go, where all code in our repository is gofmt'ed. You can get quite far with regular expressions for finding/analyzing Go code.

(And when regexps don't cut it anymore, Go has excellent infrastructure for working with Go code programmatically. http://golang.org/s/types-tutorial is a great introduction!)
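
A rough illustration of the point about strictly formatted code, sketched in Python. It assumes gofmt'ed input, where every top-level `func` starts at column 0 and is separated by single spaces, so one anchored regex can list function and method names. The sample source and the `find_go_funcs` helper are hypothetical.

```python
import re

# Hypothetical gofmt-formatted Go source, used only for illustration.
GO_SOURCE = """package main

func main() {
\trun()
}

func (s *Server) Start(addr string) error {
\treturn nil
}

// func commented() does not start at column 0
var f = func() {}
"""

# Assumption: gofmt output, so top-level declarations start at column 0 and
# use single spaces. Pattern: "func ", an optional receiver, then the name.
FUNC_DECL = re.compile(r"^func (?:\([^)]*\) )?(\w+)\(", re.MULTILINE)


def find_go_funcs(source: str) -> list[str]:
    """Return names of top-level functions and methods in gofmt'ed source."""
    return FUNC_DECL.findall(source)


print(find_go_funcs(GO_SOURCE))  # ['main', 'Start']
```

On unformatted code the same pattern would miss declarations with unusual spacing, which is where a real parser becomes the safer choice.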

fredley, almost 6 years ago
Don't use grep. Use ag[0], which is specifically designed for searching code. It's *much* faster, honors .gitignore, and the output can be piped back through grep if you like.

    ag FooBar | grep -v Baz

It's in brew/apt/yum etc. as `the_silver_searcher` (although brew install ag works fine too).

0: https://github.com/ggreer/the_silver_searcher

jimmaswell, almost 6 years ago
Still never going to beat AST-integrated searching like VS has for C#, which has a regex search too.

jolmg, almost 6 years ago
The ability to easily grep for functions in C-like code is why I've come to appreciate projects defining their functions like:

    int
    foo_func(void) {

You can grep for `^foo_func\b` to get to a declaration or definition, or `^foo_func\b.* {$` to get to a definition or `^foo_func\b.* ;` to get to a declaration. This is instead of using something like `^\w.* \bfoo_func\(`, which is what you'd need for:

    int foo_func(void) {

By the way, anyone know of a way to insert a literal asterisk here without having to follow it up with a space?
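
A small sketch of how these patterns behave on the two layouts, written in Python. The C snippets are hypothetical, and the patterns drop the space after each asterisk on the assumption that it is the formatting workaround the comment asks about.

```python
import re

# Hypothetical C snippets; only the name foo_func comes from the comment.
NAME_ON_OWN_LINE = """\
int
foo_func(void);

int
foo_func(void) {
    return 0;
}
"""

SAME_LINE = """\
int foo_func(void);

int foo_func(void) {
    return 0;
}
"""

# Patterns from the comment, minus the space after each asterisk.
ANY = re.compile(r"^foo_func\b", re.MULTILINE)
DEFINITION = re.compile(r"^foo_func\b.*\{$", re.MULTILINE)
DECLARATION = re.compile(r"^foo_func\b.*;", re.MULTILINE)
FALLBACK = re.compile(r"^\w.*\bfoo_func\(", re.MULTILINE)


def count(pattern: re.Pattern, text: str) -> int:
    """Count how many lines of text the pattern matches."""
    return len(pattern.findall(text))


print(count(ANY, NAME_ON_OWN_LINE))          # 2: declaration and definition
print(count(DEFINITION, NAME_ON_OWN_LINE))   # 1
print(count(DECLARATION, NAME_ON_OWN_LINE))  # 1
print(count(ANY, SAME_LINE))                 # 0: the name is never at column 0
print(count(FALLBACK, SAME_LINE))            # 2
```

The fallback pattern also matches any other line that starts with a word character and calls `foo_func(`, so it cannot separate definitions from call sites as cleanly.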

timwaagh, almost 6 years ago
> If not, the reviewer can quickly dismiss it as a false positive

This is where you could be wrong. We would need to give a reason for dismissing it and then the risk officer would need to approve it (or reject it). False positives can be a real pain in the ass.

tannhaeuser, almost 6 years ago
The post's core message seems to be lost on HN. It's about screening sources for supposedly insecure and/or injection-prone funcs using simple text scanning (such as strcat, which, however, is listed under iOS apps even though it is a C standard library function); supposedly grepability is also about quickly finding the code locations of messages and variables. But the comments are all about Rust or Go superiority, irrelevant grep implementation details, and AST-based code analysis tools, when these are specifically dismissed in TFA as producing too many false positives. Talk about bubbles and echo chambers.

KuhlMensch, almost 6 years ago
I do a few VERY SIMPLE greps. The most useful is a pre-commit hook to check that no blacklisted env vars exist in the commit diff. So, useful.

Grepping leans in to shell. Though if you have other environments available (Python, JavaScript, etc.), it makes sense to lean into them, e.g. I use JavaScript to examine my package.json to ensure my dependency SemVers are "exact".

That said, I rarely write static-analysis scripts: in JavaScript-world there is already a plethora of easily configurable linting & type-checking tools. If I wanted to focus on static analysis etc., I'd probably reach for https://danger.systems/js/

Side note: My CI generates a metrics.csv file, which serves as a "metric catch-all" for any script I might write, e.g. grep to count "// TODO" and "test.skip" strings, plus my JavaScript tests generate performance metrics (via monkey-patching React).

I don't actually DO ANYTHING with these metrics, but I'm quite happy knowing the CI is chugging away at its little metric diary. One day I'll plug it into something.
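
The pre-commit check mentioned at the top of this comment might look roughly like the Python sketch below. The blacklist entries, the sample diff, and the `find_blacklisted` helper are all hypothetical; the comment does not say how the actual hook is written.

```python
import re

# Hypothetical blacklist; real names would come from the project's own policy.
BLACKLISTED_ENV_VARS = ("AWS_SECRET_ACCESS_KEY", "STRIPE_API_KEY")


def find_blacklisted(diff_text: str, names=BLACKLISTED_ENV_VARS) -> list[tuple[int, str]]:
    """Return (diff line number, name) for each blacklisted name on an added line."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    hits = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        # Only lines added by the commit matter; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for match in pattern.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


# Hypothetical unified diff, as a staged-changes diff might look.
SAMPLE_DIFF = """\
--- a/config.js
+++ b/config.js
@@ -1,2 +1,2 @@
 const port = process.env.PORT;
-const key = process.env.STRIPE_API_KEY;
+const secret = process.env.AWS_SECRET_ACCESS_KEY;
"""

print(find_blacklisted(SAMPLE_DIFF))  # [(6, 'AWS_SECRET_ACCESS_KEY')]
```

In a real hook the diff text would presumably come from `git diff --cached`, and the script would exit non-zero when the list of hits is not empty. Removed lines are ignored on purpose, so deleting a blacklisted variable does not block the commit.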

johnny-lee, almost 6 years ago
I went down this road years ago.

While there's no install and initial results are quick to appear, the false positives that grep or any string search tool generates will make the cynics shoot down this simple attempt to find problems in the source code.

Problems that arose:

- What about use of those questionable APIs/constants in strings (perhaps for logging) or in comments?

- Some of the APIs listed in the article were only questionable when certain values were used. Sometimes you can get grep or your search tool of choice to play along, but if the API call spans multiple lines, or the constant has been assigned to a variable that is used instead, then a plain string search won't help.

- It's hard to ignore previously flagged but accepted uses of the APIs/constants.

- So there's a possible bug reported, but devs usually want to see the context of the problem (the code that contains the problem) quickly and easily. Some text editors can grok the grep output and place the cursor at the particular line/character with the problem, some can't.

If you go down that road to try and reduce false positives, you'll end up with a parser for your development language of choice.
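
The first problem in the list above (matches inside strings and comments) can be shown with a short Python sketch. It assumes Python source and the name `eval` as the flagged API, neither of which comes from the article. It compares a plain substring search with a search over tokens, a first step toward the parser this comment predicts.

```python
import io
import tokenize

# Hypothetical source: the flagged name appears in a comment, a string, and a call.
SAMPLE = '''\
# eval is mentioned in a comment
log("calling eval is discouraged")
result = eval(user_input)
'''


def grep_lines(source: str, name: str) -> list[int]:
    """Plain substring search: line numbers of every line containing name."""
    return [n for n, line in enumerate(source.splitlines(), 1) if name in line]


def token_lines(source: str, name: str) -> list[int]:
    """Token-aware search: only identifier tokens count, so comments and
    string literals are skipped."""
    lines = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == name:
            lines.append(tok.start[0])
    return lines


print(grep_lines(SAMPLE, "eval"))   # [1, 2, 3]
print(token_lines(SAMPLE, "eval"))  # [3]
```

Tokenizing removes the comment and string hits, but it still cannot tell which argument values make a call questionable, or follow a constant through a variable. Those cases need real parsing and data-flow analysis.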

anon1253, almost 6 years ago
I tend to work a lot in Lisp and XML, both are more or less trees if you squint (with the Lisp syntax famously being the AST due to homoiconicity) and it always makes me wonder if there are better command line tree search or tree diff algorithms out there (extra awesome if it works with git merge strategies). I mean whitespace preference is fine and all, but sometimes you just don’t care :p

alxmdev, almost 6 years ago
Hold on, strncat and strncpy are considered dangerous too, now? Not just the older versions without the *size_t num* argument?

parentheses, almost 6 years ago
You can search your codebase using livegrep [0] and get near-instant results.

[0] https://github.com/livegrep/livegrep

jpalomaki, almost 6 years ago
Random idea: maybe you could supercharge this by introducing to grep some constructs from programming languages. Now you have things like "word character", "whitespace", "start of line". In a supercharged version you would have "function", "identifier".

switch007, almost 6 years ago
The imperative headline strikes again!

K0nserv, almost 6 years ago
Just a small note that I would highly recommend ripgrep[0] over standard grep. It's another modern tool that has been created by leveraging Rust, and it's from BurntSushi[1], who is excellent.

0: https://github.com/BurntSushi/ripgrep

1: https://github.com/BurntSushi