
Show HN: Repo2vec – an open-source library for chatting with any codebase

93 points by nutellalover, 9 months ago
Hi HN, we're excited to share repo2vec: a simple-to-use, modular library enabling you to chat with any public or private codebase. It's like GitHub Copilot, but with the most up-to-date information about your repo.

We made this because sometimes you just want to learn how a codebase works and how to integrate it, without spending hours sifting through the code itself.

We tried to make it dead simple to use. With two scripts, you can index your repo and get a functional chat interface for it. Every generated response shows where in the code the context for the answer was pulled from.

We also made it plug-and-play: every component, from the embeddings to the vector store to the LLM, is completely customizable.

If you want to see a hosted version of the chat interface with its features, here's a link: https://www.youtube.com/watch?v=CNVzmqRXUCA

We would love your feedback!

- Mihail and Julia
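For readers unfamiliar with the pattern, the core idea (index code chunks as vectors, then retrieve the most similar chunks to hand to an LLM as context) can be sketched as follows. This is a toy illustration of the general retrieval technique, not repo2vec's actual API; the bag-of-words "embedding" stands in for a real embedding model, and the example chunks are made up.

```python
# Toy sketch of retrieval over code chunks (hypothetical, not repo2vec's API).
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real pipeline would call an embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# "Index" step: embed each code chunk and keep (chunk, vector) pairs.
chunks = [
    "def connect(url): open a database connection",
    "def render(template): render an HTML template",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(question: str, k: int = 1) -> list[str]:
    """'Chat' step: rank chunks by similarity; the top ones become LLM context."""
    qv = embed(question)
    ranked = sorted(index, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

print(retrieve("how do I open a database connection?"))
```

In a real system the retrieved chunks would be prepended to the LLM prompt along with their file paths, which is what lets every answer cite where in the code its context came from.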

12 comments

resters, 9 months ago
Very useful! I was just thinking this kind of thing should exist!

I would also like to be able to have the LLM know all of the documentation for any dependencies in the same way.
cool-RR, 9 months ago
I want to feed it not only the code but also a corpus of questions and answers, e.g. from the discussions page on GitHub. Is that possible?
peterldowns, 9 months ago
Very cool project, I'm definitely going to try this out. One question: why use the OpenAI embeddings API instead of BGE (BERT) or another embedding model that can be run efficiently client-side? Was there a quality difference, or did you just default to OpenAI embeddings?
zaptrem, 9 months ago
We have LLMs with context windows of hundreds of thousands of tokens, and prompt caching makes using them affordable. Why don't we just stuff the whole codebase in the context window?
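A rough back-of-the-envelope check shows why whole-repo stuffing still breaks down for larger codebases. The ~4 characters per token ratio is a common rule of thumb (an assumption, not a measurement), and the repo size below is hypothetical:

```python
# Back-of-the-envelope: does a repo fit in a 200k-token context window?
# Assumes the rough heuristic of ~4 characters per token (an approximation).
def approx_tokens(num_chars: int) -> int:
    return num_chars // 4

repo_chars = 5_000 * 400   # hypothetical: 5,000 files averaging 400 characters
window = 200_000           # tokens

print(approx_tokens(repo_chars))           # ~500,000 tokens
print(approx_tokens(repo_chars) > window)  # True: does not fit
```

Even a modest repo by these assumptions overshoots a 200k-token window by more than 2x, which is one argument for retrieval over brute-force context stuffing.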
erichi, 9 months ago
Is it somehow different from Cursor's codebase indexing/chat? I'm currently using that setup to analyse repos.
adamtaylor_13, 9 months ago
Sorry for the dumb question, but can I use this on private repositories, or is it sending my code to OpenAI?
kevshor, 9 months ago
This looks super cool! Is there currently a limit to how big a repo can be for this to work efficiently?
wiradikusuma, 9 months ago
Is this for a specific language? Does it support polyglot projects (multiple languages in one project)?
interestingsoup, 9 months ago
Any plans on allowing the use of a local LLM like Ollama or LM Studio?
ccgongie, 9 months ago
Super easy to use! Thanks! What's powering this under the hood?
RicoElectrico, 9 months ago
I wonder if it will work on https://github.com/organicmaps/organicmaps

So far, two similar solutions I tested crapped out on non-ASCII characters, because Python's UTF-8 decoder is quite strict about them.
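The strictness mentioned above is Python's default `errors="strict"` decoding, which raises on the first invalid byte. A sketch of the failure mode and the lenient fallback an indexer could use instead (the byte string here is a made-up example):

```python
# Python's default strict UTF-8 decoding vs. a lenient fallback.
raw = b"caf\xc3\xa9 \xff end"  # valid UTF-8 for "café ", then an invalid byte 0xff

try:
    raw.decode("utf-8")  # strict (the default): raises on the bad byte
except UnicodeDecodeError as exc:
    print(f"strict decode failed: {exc.reason}")

# A repo indexer can survive messy files by replacing undecodable bytes:
text = raw.decode("utf-8", errors="replace")
print(text)  # the bad byte becomes U+FFFD: 'café \ufffd end'
```

`errors="ignore"` is another option, but `replace` keeps a visible marker where data was lost, which is usually friendlier for debugging which files tripped the indexer.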
ranger_danger, 9 months ago
Is there a Docker image?