Hey HN! We've just open-sourced Codegen (<a href="https://github.com/codegen-sh/codegen-sdk">https://github.com/codegen-sh/codegen-sdk</a>), a Python library for manipulating Python and JS/React codebases.<p>Codegen was designed by working backwards from real-world, large-scale codebase analysis and refactors we performed on multi-million-line enterprise codebases. It provides a scriptable interface to a powerful, multi-language language server built on Tree-sitter.<p>We realized that many code transformation tasks that impact large teams - refactors, enforcing patterns, analyzing control flow - are fundamentally programmatic operations. Yet existing tools like LibCST and jscodeshift often require you to think in terms of ASTs and parser internals rather than the high-level changes you want to make.<p>So we built Codegen to match how developers actually think about code changes:<p><pre><code> # Move a symbol to a new file
# Handles imports, references, dependencies
function.move_to_file("new_file.py")
# Rename across the codebase
class_def.rename("NewName") # Updates all usages, preserves formatting
# Analyze call patterns
for usage in function.usages:
    print(f"Used in {usage.file.name}")
</code></pre>
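For a rough feel of what a `usages` lookup has to track, here is a minimal sketch that uses only Python's standard `ast` module. It is not how Codegen is implemented (Codegen is built on Tree-sitter and a code graph); `find_usages` and the two in-memory files are hypothetical names made up for illustration.

```python
import ast


def find_usages(sources: dict[str, str], name: str) -> list[tuple[str, int]]:
    """Return (filename, line number) for each import of or reference to `name`."""
    hits = []
    for filename, text in sources.items():
        tree = ast.parse(text, filename=filename)
        for node in ast.walk(tree):
            # Bare references such as `helper(21)` or `f = helper`
            if (
                isinstance(node, ast.Name)
                and node.id == name
                and isinstance(node.ctx, ast.Load)
            ):
                hits.append((filename, node.lineno))
            # Imports such as `from utils import helper`
            elif isinstance(node, ast.ImportFrom):
                if any(alias.name == name for alias in node.names):
                    hits.append((filename, node.lineno))
    return sorted(hits)


# Hypothetical two-file project, held in memory for the example
sources = {
    "utils.py": "def helper(x):\n    return x * 2\n",
    "app.py": "from utils import helper\n\nprint(helper(21))\n",
}

for filename, lineno in find_usages(sources, "helper"):
    print(f"Used in {filename}:{lineno}")
```

Even this toy version ignores aliases (`import helper as h`), attribute access (`utils.helper`), and shadowed names, which is the kind of bookkeeping a real refactoring tool has to get right.<p>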
Codegen handles the edge cases automatically - updating imports, preserving dependencies, maintaining references, and resolving naming conflicts. You focus on intent, we handle the details.<p>Under the hood, Codegen performs static analysis to build a rich graph representation of your code. This enables:<p>- Versatile and comprehensive operations<p>- Built-in visualization capabilities<p>- Blazing fast execution of large-scale refactors<p>We've seen a wide variety of advanced code manipulation programs emerge, including:<p>- Mining codebases for LLM pre-training data<p>- Analyzing security vulnerabilities<p>- Large-scale API migrations<p>- Enforcing code patterns<p>We're excited to share this with the community and look forward to your feedback. Give it a spin and let us know what you think!<p><pre><code> uv tool install codegen
codegen notebook --demo
</code></pre>
Docs: <a href="https://docs.codegen.com" rel="nofollow">https://docs.codegen.com</a>
GitHub: <a href="https://github.com/codegen-sh/codegen-sdk">https://github.com/codegen-sh/codegen-sdk</a>
Community: <a href="https://community.codegen.com" rel="nofollow">https://community.codegen.com</a><p>Let us know if you have any questions or interesting use cases you'd like to explore.
Man, at first glance the documentation looks so good.<p>I’ve been meaning to build a PoC for directly manipulating symbols instead of text, with the idea of eventually eliminating the possibility of syntax errors.<p>The task always looked one step too big to be worth it - the foundation for programmatically manipulating code seemed to be missing. Maybe Roslyn fits the bill, but C# isn’t interesting to me ecosystem-wise.<p>It seems like this may be what I was waiting for - pretty cool!