Show HN: Globstar – Open-source static analysis toolkit

103 points by sanketsaurav, 3 months ago
Hey HN! We're Jai and Sanket, co-founders of DeepSource (YC W20). We're open-sourcing Globstar (https://github.com/DeepSourceCorp/globstar), a static analysis toolkit that lets you easily write and run custom code quality and security checkers in YAML [1] or Go [2].

After 5+ years of building AST-based static analyzers that process millions of lines of code daily at DeepSource, we kept hearing a common request from customers: "How do we write custom checks specific to our codebase?" AppSec and DevOps teams have a lot of learned anti-patterns and security rules they want to enforce across their orgs, and being able to do that without being a static analysis expert came up as an important need.

We initially built an internal framework using tree-sitter [3] for our proprietary infrastructure-as-code analyzers, which enabled us to rapidly create new checkers. We realized that making the framework open-source could solve this problem for everyone.

Our key insight was that writing checkers isn't the hard part anymore. Modern AI assistants like ChatGPT and Claude are excellent at generating tree-sitter queries with very high accuracy. We realized that tree-sitter's gnarly S-expression syntax isn't a problem anymore (since the AI will be doing all the generation anyway), and we can instead focus on building a fast, flexible, and reliable checker runtime around it.

So instead of creating yet another DSL, we use tree-sitter's native query syntax. Yes, the expressions look more complex than simplified DSLs, but they give you direct access to your code's actual AST structure, which means your rules work exactly as you'd expect them to. When you need to debug a rule, you're working with the actual structure of your code, not an abstraction that might hide important details.

We've also designed Globstar to have a gradual learning curve: the YAML interface works well for simple checkers, and the Go interface can handle complex scenarios when you need features like cross-file analysis, scope resolution, data flow analysis, and context awareness. The Go API gives you direct access to tree-sitter bindings, so you can write arbitrarily complex checkers on day one.

Key features:

- Written in Go with native tree-sitter bindings, distributed as a single binary
- MIT-licensed
- Write all your checkers in a ".globstar" folder in your repo, in YAML or Go, and just run "globstar check" without any build steps
- Multi-language support through tree-sitter (20+ languages today)

We have a long way to go and a very exciting roadmap for Globstar, and we'd love to hear your feedback!

[1] https://globstar.dev/guides/writing-yaml-checker
[2] https://globstar.dev/guides/writing-go-checker
[3] https://tree-sitter.github.io/tree-sitter/
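To make "a checker that works on the AST" concrete, here is a minimal sketch in Python using only the standard-library `ast` module. It is not Globstar's YAML or Go interface, and Python's `ast` node types differ from tree-sitter's grammar nodes; the function name `find_eval_calls` and the choice of `eval` as the flagged call are assumptions made for illustration.

```python
import ast


def find_eval_calls(source: str) -> list[int]:
    """Return the line numbers of every direct call to eval()."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        # Match only real call nodes whose callee is the bare name `eval`.
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "eval"):
            lines.append(node.lineno)
    return sorted(lines)


sample = (
    "x = eval(user_input)\n"
    "print('eval(x) inside a string is not a call')\n"
)
print(find_eval_calls(sample))  # [1]
```

Because the match is made on call nodes rather than on text, the `eval(x)` inside the string literal on line 2 is not reported, which is the kind of precision a text search cannot offer.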

8 comments

markrian, 3 months ago
Interesting! Do you have a page which compares globstar against other similar tools, like Semgrep, ast-grep, Comby, etc?

For instance, something like https://ast-grep.github.io/advanced/tool-comparison.html#comparison-with-other-frameworks.
xxpor, 3 months ago
Another rule engine checker that doesn't support the language that needs this type of thing the most: C

In this case, it's inexplicable to me since tree-sitter supports C fine.
micksmix, 3 months ago
One of the main benefits of Semgrep is its unified DSL that works across all supported languages. In contrast, using the Go module "smacker/go-tree-sitter" can expose you to differences in S-expression outputs due to variations and changes in independent grammars.

I've seen grammars that are part of "smacker/go-tree-sitter" change their syntax between versions, which can lead to broken S-expressions. Semgrep solves that with their DSL, because it's also an abstraction away from those kinds of grammar changes.

I'm a bit concerned that tree-sitter S-expressions can become "write-only" and rely on the reader to also understand the grammar for which they've been generated.

For example, here's a Semgrep rule for detecting a Jinja2 environment with autoescaping disabled:

```yaml
rules:
  - id: incorrect-autoescape-disabled
    patterns:
      - pattern: jinja2.Environment(... , autoescape=$VAL, ...)
      - pattern-not: jinja2.Environment(... , autoescape=True, ...)
      - pattern-not: jinja2.Environment(... , autoescape=jinja2.select_autoescape(...), ...)
      - focus-metavariable: $VAL
```

Now, compare it to the corresponding tree-sitter S-expression (generated by o3-mini-high):

```scheme
(
  call
    function: (attribute
      object: (identifier) @module (#eq? @module "jinja2")
      attribute: (identifier) @func (#eq? @func "Environment"))
    arguments: (argument_list
      (_)*
      (keyword_argument
        name: (identifier) @key (#eq? @key "autoescape")
        value: (_) @val
        (#not-match @val "^True$")
        (#not-match @val "^jinja2\\.select_autoescape\\("))
      (_)*)
) @incorrect_autoescape
```

People can disagree, but I'm not sure that tree-sitter S-expressions are an upgrade over a DSL. I'm hoping I'm proven wrong ;-)
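For readers comparing the two rules, the check they both encode can be written out as plain code. The following is a sketch in Python using the standard-library `ast` module, not Semgrep's or Globstar's engine; the helper names (`is_safe_autoescape`, `find_unsafe_autoescape`) are hypothetical. Like both rules above, it only fires when `autoescape` is passed explicitly and the call is spelled `jinja2.Environment(...)`.

```python
import ast


def is_safe_autoescape(value: ast.expr) -> bool:
    """True for the two values the rules above exclude from matching."""
    # Literal True
    if isinstance(value, ast.Constant) and value.value is True:
        return True
    # jinja2.select_autoescape(...)
    return (isinstance(value, ast.Call)
            and isinstance(value.func, ast.Attribute)
            and value.func.attr == "select_autoescape"
            and isinstance(value.func.value, ast.Name)
            and value.func.value.id == "jinja2")


def find_unsafe_autoescape(source: str) -> list[int]:
    """Return line numbers of autoescape values that are neither safe form."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only calls written exactly as jinja2.Environment(...)
        if not (isinstance(func, ast.Attribute)
                and func.attr == "Environment"
                and isinstance(func.value, ast.Name)
                and func.value.id == "jinja2"):
            continue
        for kw in node.keywords:
            if kw.arg == "autoescape" and not is_safe_autoescape(kw.value):
                findings.append(kw.value.lineno)
    return sorted(findings)


sample = (
    "import jinja2\n"
    "a = jinja2.Environment(autoescape=False)\n"
    "b = jinja2.Environment(autoescape=True)\n"
    "c = jinja2.Environment(autoescape=jinja2.select_autoescape())\n"
)
print(find_unsafe_autoescape(sample))  # [2]
```

The sketch compares node types and names instead of matching regular expressions against source text, which is roughly the distinction between the Semgrep rule's structural `pattern-not` clauses and the query's `#not-match` predicates.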
pdimitar, 3 months ago
Wow this looks great. I will be giving it a go VerySoon™!

Looking forward to writing some enhanced linters.
etyp, 3 months ago
I really love that static analyzers are pushing in this direction! I loved writing Clippy lints and I think applying that "it's just code" with custom checks is a powerful idea. I worked on a static analysis product and the rules for that were horrible; I don't blame the customers for not really wanting to write them.

Is there a general way to apply/remove/act on taint in Go checkers? I may not be digging deeply enough but it seems like the example just uses some `unsafeVars` map that is made with a magic `isUserInputSource` method. It's hard for me to immediately tell what the capabilities there are; I bet I'm missing a bit.
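The pattern described here, a set of "unsafe" variables seeded by a source predicate, can be sketched in a few lines. This is a Python illustration using the standard-library `ast` module, not Globstar's Go API or its actual example: `is_user_input_source` and `unsafe_vars` only mirror the names quoted in the comment, and treating `input()` as the source and `eval()` as the sink are assumptions.

```python
import ast


def is_user_input_source(node: ast.expr) -> bool:
    """Hypothetical source test: any call to input() is user-controlled."""
    return (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "input")


def find_tainted_eval(source: str) -> list[int]:
    """Flag eval() calls whose argument is a source or a tainted variable.

    Deliberately naive: top-level statements in order, no branches,
    functions, attributes, or sanitizers.
    """
    unsafe_vars: set[str] = set()
    findings = []
    for stmt in ast.parse(source).body:
        # Sink check first, using the taint state before this statement.
        for node in ast.walk(stmt):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id == "eval"):
                for arg in node.args:
                    if is_user_input_source(arg) or (
                            isinstance(arg, ast.Name) and arg.id in unsafe_vars):
                        findings.append(node.lineno)
                        break
        # Propagation: assignment from a source or from a tainted name.
        if isinstance(stmt, ast.Assign):
            tainted = is_user_input_source(stmt.value) or (
                isinstance(stmt.value, ast.Name)
                and stmt.value.id in unsafe_vars)
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if tainted:
                        unsafe_vars.add(target.id)
                    else:
                        # Reassignment to a clean value removes the taint.
                        unsafe_vars.discard(target.id)
    return findings


sample = (
    "name = input()\n"
    "alias = name\n"
    "eval(alias)\n"
    "name = 'constant'\n"
    "eval(name)\n"
)
print(find_tainted_eval(sample))  # [3]
```

Applying, propagating, and removing taint all reduce to updates on that one set here; a general mechanism would also need to handle control flow, function boundaries, and sanitizers, which this sketch does not attempt.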
micksmix, 3 months ago
This is a really interesting project!

I'd love to hear how this project differs from Bearer, which is also written in Go and based on tree-sitter? https://github.com/Bearer/bearer

Regardless, considering there is a large existing open-source collection of Semgrep rules, is there a way they can be adapted or transpiled to tree-sitter S-expressions so that they may be reused with Globstar?
henning, 3 months ago
Is there a way to add a comment to disable the check rule similar to what you can do in ESLint to ignore a rule?
codepathfinder, 3 months ago
Nothing comes closer to CodeQL!

If anyone is interested, please check out codepathfinder.dev, a truly open-source CodeQL alternative.

Feedback is appreciated!