科技回声

Hey HN, I wanted to share a simple command line tool I made that has sped up and simplified my LLM-assisted coding workflow. Whenever possible, I’ve been trying to use Claude as a first pass when implementing new features / changes. But I found that, depending on the type of change I was making, I was spending a lot of thought finding and deciding which source files should be included in the prompt. The need to copy/paste each file individually also becomes a mild annoyance.

First, I implemented `repogather --all`, which unintelligently copies all source files in your repository to the clipboard (delimited by their relative filepaths). To my surprise, for less complex repositories, this alone is often completely workable for Claude, and much better than pasting in just the few files you are looking to update. But I never would have done it if I had to copy/paste everything individually. 200k is quite a lot of tokens!

But as soon as the repository grows to a certain complexity level (even if it is under the input token limit), I’ve found that Claude can get confused by different unrelated parts / concepts across the code. It performs much better if you make an attempt to exclude logic that is irrelevant to your current change. So I implemented `repogather "<query here>"`, e.g. `repogather "only files related to authentication"`. This uses gpt-4o-mini with structured outputs to provide a relevance score for each source file (with automatic exclusions for .gitignore patterns, tests, configuration, and other manual exclusions with `--exclude <pattern>`).

gpt-4o-mini is so cheap and fast that for my ~8 dev startup’s repo, it takes under 5 seconds and costs 3-4 cents (with appropriate exclusions). Plus, you get to watch the output stream while you wait, which always feels fun.

The retrieval isn’t always perfect the first time, but it is fast, which allows you to see what files it returned and iterate quickly on your command. I’ve found this to be much more satisfying than embedding-search based solutions I’ve used, which seem to fail in pretty opaque ways.

https://github.com/gr-b/repogather

Let me know if it is useful to you! Always love to talk about how to better integrate LLMs into coding workflows.
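For readers wondering what the `--all` mode roughly amounts to, here is a minimal sketch. It is not repogather's actual code: the function name `gather`, the `--- path ---` delimiter, and the glob-style `exclude` patterns are assumptions made purely for illustration, and the real tool's delimiter and exclusion rules may differ.

```python
import os
from fnmatch import fnmatch
from pathlib import Path


def gather(root, exclude=()):
    """Concatenate text files under root, each preceded by its relative path.

    `exclude` is a hypothetical list of glob patterns matched against the
    relative path, e.g. ("tests/*", "*.lock").
    """
    root = Path(root)
    chunks = []
    # os.walk does not follow symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            path = Path(dirpath) / name
            rel = path.relative_to(root).as_posix()
            if any(fnmatch(rel, pattern) for pattern in exclude):
                continue
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            chunks.append(f"--- {rel} ---\n{text}")
    return "\n\n".join(chunks)
```

Printing the returned string and piping it to a clipboard utility such as `pbcopy` on macOS would approximate the copy step.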

9 comments

faangguyindia, 8 months ago

I usually only edit 1 function using an LLM on an old code base.

On greenfield projects, I ask Claude Sonnet to write all the functions and their signatures, with return values etc.

Then I have a script which sends these signatures to Google Flash, which writes all the functions for me.

All this happens in parallel.

I've found that if you limit the scope, Google Flash writes the best code, and it's ultra fast and cheap.
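The fan-out step described in this comment could look roughly like the sketch below. The commenter's script is not shown, so everything here is an assumption: `write_body` is a hypothetical stand-in for the call to the fast model, and `implement_all` is an illustrative name.

```python
from concurrent.futures import ThreadPoolExecutor


def write_body(signature: str) -> str:
    """Hypothetical placeholder for a request to a fast, cheap model."""
    return f"{signature}\n    raise NotImplementedError\n"


def implement_all(signatures, worker=write_body, max_workers=8):
    """Send each signature to `worker` in parallel.

    Executor.map returns results in the same order as the inputs,
    regardless of which request finishes first.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, signatures))
```

Threads fit here because each call would spend its time waiting on the network, and keeping each request down to a single signature is what limits the scope the model has to handle.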


mrtesthah, 8 months ago

This symbolic link broke it:

  srtp -> .

    File "repogather/file_filter.py", line 170, in process_directory
      if item.is_file():
         ^^^^^^^^^^^^^^
  OSError: [Errno 62] Too many levels of symbolic links: 'submodules/externals/srtp/include/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp'
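One common way to make a directory walker tolerate a self-referential link like `srtp -> .` is to remember the resolved path of every directory already visited. The sketch below is an assumption-laden illustration, not a patch for repogather: `walk_files` is a hypothetical name, and the real `process_directory` may be structured quite differently.

```python
import os
from pathlib import Path
from typing import Iterator


def walk_files(root: Path) -> Iterator[Path]:
    """Yield regular files under root, visiting each real directory once."""
    seen = set()  # resolved paths of directories already visited
    stack = [Path(root)]
    while stack:
        current = stack.pop()
        real = os.path.realpath(current)
        if real in seen:
            continue  # a link such as `srtp -> .` led back to a known directory
        seen.add(real)
        try:
            entries = sorted(current.iterdir())
        except OSError:
            continue  # unreadable directory
        for item in entries:
            try:
                if item.is_dir():
                    stack.append(item)
                elif item.is_file():
                    yield item
            except OSError:
                continue  # e.g. too many levels of symbolic links
```

A simpler alternative is to skip symlinked directories entirely, which is what `os.walk` does by default.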


reacharavindh, 8 months ago

Do you literally paste a wall of text (source code of the filtered whole repo) into the prompt and ask the LLM to give you a diff patch as an answer to your question?

For example: "Here is my whole project, now implement user authentication with plain username/password"?


reidbarber, 8 months ago

Nice! I built something similar, but in the browser with drag-and-drop at https://files2prompt.com

It doesn’t have all the fancy LLM integration though.

fellowniusmonk, 8 months ago

This looks very cool for complex queries!

If your codebase is structured in a very modular way, then this one-liner mostly just works:

  find . -type f -exec echo {} \; -exec cat {} \; | pbcopy


smcleod, 8 months ago

There are so many of these popping up! Here's mine - https://github.com/sammcj/ingest

jondwillis, 8 months ago

In this thread: nobody using Cursor, embedding documentation, using various RAG techniques…


ukuina, 8 months ago

It's fascinating to see how different frameworks are dealing with the problem of populating context correctly. Aider, for example, asks users to manually add files to context. Claude Dev attempts to grep files based on LLM intent. And Continue.dev uses vector embeddings to find relevant chunks and files.

I wonder if an increase in usable (not advertised) context tokens may obviate many of these approaches.


faangguyindia, 8 months ago

LLMs for coding are a bit meh after the novelty wears off.

I've had problems where the LLM doesn't know which library version I am using. It keeps suggesting methods which do not exist, etc., as if LLMs are unaware of the library version.

The place where I've found LLMs to be most effective and effortless is the CLI.

My brother made this, but I use it every day: https://github.com/zerocorebeta/Option-K



Show HN: Repogather – copy relevant files to clipboard for LLM coding workflows
