Hey HN, I've built a CLI, gpt-code-assistant, to help with exploring and understanding any codebase. This came about as I started using GPT-4 in my day-to-day coding tasks: I would often copy-paste code into the browser and ask GPT-4 to explain parts I wasn't familiar with.

So, instead of repeatedly switching between VS Code and the browser and copy-pasting, this tool indexes your code and, when you ask a question, uses vector embeddings to find the most relevant parts of your codebase to answer it. Whether you're trying to understand unfamiliar code, generate documentation, or debug issues, this tool should be helpful.

Right now it only supports OpenAI, so you need an OpenAI API key to use the tool. Eventually I can see this being replaced with a local open-source model, so that no code leaves your machine and the dependency on OpenAI goes away.

This is an iteration of https://news.ycombinator.com/item?id=36521699

How does this work?

The tool indexes your code, converts it into vector embeddings [1], and stores them locally in ChromaDB [2]. When you ask a question, the question is also converted into a vector embedding and used to query the stored embeddings [3] for the 10 closest matches. Those matches are then passed as context in a chat completion request to OpenAI [4], and GPT-4 generates the answer. (A rough sketch of this flow is included after the usage examples below.)

Getting Started

Install it with pip, create a project to index your code, and start asking questions:

  pip install gpt-code-assistant
  gpt-code-assistant create-project <project-name> <path-to-codebase>
  gpt-code-assistant query <project-name> "Your question here"

How can you best use this tool?

- Understanding unfamiliar code: gpt-code-assistant query <project-name> "What does the function [function name] do?"

- Documenting your code: gpt-code-assistant query <project-name> "Help me document the function [function name]"

- Generating code: gpt-code-assistant query <project-name> "Can you create a function that does X?"

- Debugging help: gpt-code-assistant query <project-name> "What might be causing this error [error]?"

- Testing: gpt-code-assistant query <project-name> "Can you help write a test for this function [function name]?"

Thanks to Spencer Miskoviak for helping build multiple iterations of this tool before it could be open-sourced!
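To make the index -> query -> answer flow concrete, here is a minimal Python sketch. It is not the actual gpt-code-assistant source: it assumes the pre-1.0 openai package and chromadb, uses a single hard-coded file in place of walking the codebase, and names like embed and "my-project" are made up for illustration.

    import os
    import openai
    import chromadb

    openai.api_key = os.environ["OPENAI_API_KEY"]

    def embed(text):
        """Turn a chunk of code (or a question) into a vector embedding."""
        resp = openai.Embedding.create(model="text-embedding-ada-002", input=[text])
        return resp["data"][0]["embedding"]

    # 1. Index: embed each source file and store it in a local ChromaDB collection.
    client = chromadb.PersistentClient(path=".gpt-code-assistant")
    collection = client.get_or_create_collection("my-project")

    files = {"app.py": open("app.py").read()}  # stand-in for walking the codebase
    for path, source in files.items():
        collection.add(ids=[path], documents=[source], embeddings=[embed(source)])

    # 2. Query: embed the question and pull the 10 closest chunks.
    question = "What does the function create_project do?"
    hits = collection.query(query_embeddings=[embed(question)], n_results=10)
    context = "\n\n".join(hits["documents"][0])

    # 3. Answer: send the matched code as context to GPT-4.
    answer = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Answer questions about the provided code."},
            {"role": "user", "content": context + "\n\nQuestion: " + question},
        ],
    )
    print(answer["choices"][0]["message"]["content"])

A real indexer would also split large files into smaller chunks before embedding, since whole files can exceed the embedding model's input limit, but the shape of the loop is the same.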
I'd love for you to check out gpt-code-assistant and share your feedback and thoughts.

Notes

[1] You can learn about vector embeddings here: https://platform.openai.com/docs/guides/embeddings/what-are-embeddings

[2] ChromaDB is likely the best vector DB right now for storing embeddings locally; internally it uses SQLite: https://docs.trychroma.com/

[3] ChromaDB makes querying super simple: https://docs.trychroma.com/usage-guide#querying-a-collection

[4] The chat completion request sends the prompt along with the retrieved context to OpenAI's GPT and returns the result: https://platform.openai.com/docs/guides/gpt/chat-completions-api