Show HN: Dump entire Git repos into a single file for LLM prompts

57 points by artkulak 8 months ago
Hey! I wanted to share a tool I've been working on. It's still very early and a work in progress, but I've found it incredibly helpful when working with Claude and OpenAI's models.

What it does: I created a Python script that dumps your entire Git repository into a single file. This makes it much easier to use with Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems.

Key Features:
- Respects .gitignore patterns
- Generates a tree-like directory structure
- Includes file contents for all non-excluded files
- Customizable file type filtering

Why I find it useful for LLM/RAG:
- Full Context: It gives LLMs a complete picture of my project structure and implementation details.
- RAG-Ready: The dumped content serves as a great knowledge base for retrieval-augmented generation.
- Better Code Suggestions: LLMs seem to understand my project better and provide more accurate suggestions.
- Debugging Aid: When I ask for help with bugs, I can provide the full context easily.

How to use it:
Example: python dump.py /path/to/your/repo output.txt .gitignore py js tsx

Again, it's still a work in progress, but I've found it really helpful in my workflow with AI coding assistants (Claude/OpenAI). I'd love to hear your thoughts, suggestions, or if anyone else finds this useful!

https://github.com/artkulak/repo2file

P.S. If anyone wants to contribute or has ideas for improvement, I'm all ears!
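
For readers who want a feel for what such a script involves, here is a minimal sketch of the four listed features. It is not the code from repo2file: the function names (`load_ignore_patterns`, `is_ignored`, `render_tree`, `dump_repo`) and the `--- path ---` separator are assumptions made for illustration, and the ignore matching is a deliberate simplification of real .gitignore semantics (no negation, no anchoring).

```python
import fnmatch
import os


def load_ignore_patterns(ignore_path):
    """Return the non-empty, non-comment lines of an ignore file."""
    if not os.path.isfile(ignore_path):
        return []
    with open(ignore_path, encoding="utf-8") as fh:
        stripped = [line.strip() for line in fh]
    return [line for line in stripped if line and not line.startswith("#")]


def is_ignored(rel_path, patterns):
    """Simplified matching; real .gitignore rules (negation, anchoring) are richer."""
    parts = rel_path.split("/")
    for pattern in patterns:
        pattern = pattern.rstrip("/")
        if fnmatch.fnmatch(rel_path, pattern):
            return True
        if any(fnmatch.fnmatch(part, pattern) for part in parts):
            return True
    return False


def render_tree(paths):
    """Turn a walk-ordered list of relative paths into an indented tree listing."""
    lines, seen = [], set()
    for rel in paths:
        parts = rel.split("/")
        for depth in range(len(parts) - 1):
            directory = "/".join(parts[: depth + 1])
            if directory not in seen:
                seen.add(directory)
                lines.append("  " * depth + parts[depth] + "/")
        lines.append("  " * (len(parts) - 1) + parts[-1])
    return lines


def dump_repo(root, extensions=(), patterns=()):
    """Return the tree listing followed by the contents of every kept file."""
    suffixes = tuple("." + ext.lstrip(".") for ext in extensions)
    kept = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Never descend into the .git directory; sort for stable output.
        dirnames[:] = sorted(d for d in dirnames if d != ".git")
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            if is_ignored(rel, patterns):
                continue
            if suffixes and not rel.endswith(suffixes):
                continue
            kept.append(rel)

    sections = ["Directory structure:", *render_tree(kept), ""]
    for rel in kept:
        with open(os.path.join(root, rel), encoding="utf-8", errors="replace") as fh:
            sections += [f"--- {rel} ---", fh.read()]
    return "\n".join(sections)
```

Calling `dump_repo(".", extensions=("py", "js", "tsx"), patterns=load_ignore_patterns(".gitignore"))` would mirror the example command above. A tool meant for real use would more likely delegate ignore handling to Git itself or to a dedicated matching library.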

16 comments

subeadia 8 months ago
These are extremely common these days. Here are a few I've collected over the past few months:

- [files-to-prompt](https://github.com/simonw/files-to-prompt) (from the GOAT simonw)
- [code2prompt](https://github.com/mufeedvh/code2prompt)
- https://gh-repo-dl.cottonash.com/
- [1filellm](https://github.com/jimmc414/1filellm)
- [repopack](https://github.com/yamadashy/repopack)
- [ingest](https://github.com/sammcj/ingest)

What makes yours better?
trees101 8 months ago
Take a look at what aider does to create a repo map using tree-sitter: https://aider.chat/docs/repomap.html and https://aider.chat/2023/10/22/repomap.html

I guess the difference is that your script produces a complete copy, whereas aider uses a concise summary, which is necessary when the context window is full.
smcleod 8 months ago
This is a similar tool I wrote for myself called "ingest". It ingests files/directories into LLM-friendly markdown, estimates token usage, and can estimate vRAM usage for different models and quantisations. It shows you a table highlighting which quantisation, context size, and k/v cache quantisation will fit in a given (v)RAM size: https://github.com/sammcj/ingest
some_rand_guy0 8 months ago
That's cool. I've used it. I'd add:

- treat '-' as stdout
- named arguments
- don't filter ignore files by checking whether they start with '.', because that causes a local .gitignore to not be found and to be treated as an extension :)
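
The three suggestions above could be sketched with the standard library's `argparse`. The flag names (`--output`, `--ignore-file`, `--ext`) and the helpers `build_parser` and `open_output` are hypothetical and are not repo2file's actual interface. Naming the ignore file explicitly means the script no longer has to guess which positional argument is an ignore file and which is an extension.

```python
import argparse
import contextlib
import sys


def build_parser():
    """Named arguments instead of position-dependent ones (hypothetical flags)."""
    parser = argparse.ArgumentParser(description="Dump a repository into one file.")
    parser.add_argument("repo", help="path to the repository")
    parser.add_argument("-o", "--output", default="-",
                        help="output file, or '-' for stdout (default)")
    parser.add_argument("--ignore-file", default=".gitignore",
                        help="path to an ignore file")
    parser.add_argument("-e", "--ext", action="append", default=[],
                        help="file extension to include; may be repeated")
    return parser


@contextlib.contextmanager
def open_output(target):
    """Yield sys.stdout for '-', otherwise a file opened for writing."""
    if target == "-":
        # stdout is not closed on exit, so the caller can keep using it.
        yield sys.stdout
    else:
        with open(target, "w", encoding="utf-8") as fh:
            yield fh
```

With this shape, an invocation such as `python dump.py /path/to/your/repo -o - --ignore-file ./.gitignore -e py -e js` writes to stdout, which makes it easy to pipe the dump into a clipboard tool.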
brumar 8 months ago
I skimmed the README but did not see support for prefixing each line with line numbers. This is an absolute must-have for people like me who have a workflow centered around generating git patches. In my experience, without line numbers, generated patches are much more likely to be incorrect.
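
Line-number prefixing of the kind requested here is a small transformation. The sketch below uses a hypothetical helper, `number_lines`, and an arbitrary `N| ` prefix format; neither comes from repo2file.

```python
def number_lines(text, start=1):
    """Prefix each line with a right-aligned line number and a '| ' separator."""
    lines = text.splitlines()
    if not lines:
        return ""
    # Pad to the width of the largest number so the content stays aligned.
    width = len(str(start + len(lines) - 1))
    return "\n".join(
        f"{number:>{width}}| {line}" for number, line in enumerate(lines, start)
    )
```

For example, `number_lines("a\nb")` returns `"1| a\n2| b"`. The model can then refer to exact line numbers when proposing a patch, at the cost of a few extra tokens per line.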
llagerlof 8 months ago
Nice. I have a few suggestions:

Put each file's code block inside three backticks at the beginning and three backticks at the end, since it's the default for each file.

Remove the dashes to save tokens.

In the title for each code block, put the full relative path to the file, since some projects have many files with the same name.
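
One way to read these suggestions is as the per-file formatter sketched below. The names `format_file`, `FENCE`, and `LANG_BY_EXT` are made up for this example, and the extension-to-language table is only a small illustrative sample.

```python
import os

# Three backticks, built this way so the snippet can itself sit inside a fence.
FENCE = "`" * 3

# Illustrative mapping only; unknown extensions get an unlabeled fence.
LANG_BY_EXT = {".py": "python", ".js": "javascript", ".tsx": "tsx"}


def format_file(rel_path, content):
    """Emit the full relative path as a title, then the content in a fenced block."""
    ext = os.path.splitext(rel_path)[1]
    lang = LANG_BY_EXT.get(ext, "")
    # Make sure the closing fence starts on its own line.
    body = content if content.endswith("\n") else content + "\n"
    return f"{rel_path}\n{FENCE}{lang}\n{body}{FENCE}\n"
```

Using the full relative path as the title keeps `src/utils/index.js` and `tests/utils/index.js` distinguishable. One caveat of this sketch: a file whose content already contains three backticks would end the fence early, so a longer fence would be needed for such files.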
vvoruganti 8 months ago
Made a similar one that's not super polished: https://github.com/VVoruganti/repo-to-prompt
breck 8 months ago
Interesting! There was another Show HN that did this same thing earlier in the day!

https://news.ycombinator.com/item?id=41480373
mistermann 8 months ago
Something like this that could automatically scrape a set of URLs into a file would also be useful for trying to learn how to use various terrible enterprise software applications (SAP).
_andrei_ 8 months ago
made one as well with interactive selection and token counting: https://github.com/3rd/promptpack
vnjxk 8 months ago
There is an API for this at https://txtrepo.com. I used it with n8n to create PRs on issues.
johnisgood 8 months ago
How does this (or similar tools) differ from just a simple `cat foo bar > out`?
rnapoles 8 months ago
Great, I didn't know about this type of tool, thanks.
ndr_ 8 months ago
Another approach is to just tar up the files, without compression. Works well with Claude via API.
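
The tar approach can be reproduced with Python's standard `tarfile` module, as in the sketch below. The helper names `skip_git` and `tar_repo` are assumptions for illustration; opening the archive in mode `"w"` writes it without compression, so text files remain readable as plain text between the tar headers.

```python
import os
import tarfile


def skip_git(member):
    """Drop the .git directory and everything under it from the archive."""
    if ".git" in member.name.split("/"):
        return None
    return member


def tar_repo(root, dest):
    """Write an uncompressed tar archive of root to dest."""
    top_level = os.path.basename(os.path.abspath(root))
    # Mode "w" means plain tar with no gzip, bzip2, or xz layer.
    with tarfile.open(dest, "w") as archive:
        archive.add(root, arcname=top_level, filter=skip_git)
```

Unlike the dedicated tools in this thread, this does not apply .gitignore rules or file type filtering, so build artifacts and binaries would be included unless the filter is extended.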
atxtechbro 8 months ago
Seems like a common itch to scratch and a good tool to scratch it with. I created 'linusfiles' and 'grabout' as tools for this. Grabout copies the last input and error message or other output to the clipboard, and linusfiles copies the tracked files to the clipboard.

But I like the idea of tarballing it, as ndr_ suggested. I'm thinking that could be the move here.

In case anyone wants to see my workflows: https://github.com/atxtechbro/shell-tooling
AyushK1 8 months ago
That's a cool project.