Okay, I actually got local copilot set up. You will need these 4 things.<p>1) CodeLlama 13B or another FIM model <a href="https://huggingface.co/codellama/CodeLlama-13b-hf" rel="nofollow noreferrer">https://huggingface.co/codellama/CodeLlama-13b-hf</a>. You want "Fill in the Middle" models because you're looking at context on both sides of your cursor.<p>2) HuggingFace llm-ls <a href="https://github.com/huggingface/llm-ls">https://github.com/huggingface/llm-ls</a>, a large language model Language Server (is this making sense yet?).<p>3) HuggingFace inference framework. <a href="https://github.com/huggingface/text-generation-inference">https://github.com/huggingface/text-generation-inference</a> At least when I tested, you couldn't use something like llama.cpp or exllama with llm-ls, so you need to break out the heavy-duty badboy HuggingFace inference server (rough sketch below). Just config and run it, then config and run llm-ls.<p>4) Okay, I mean you need an editor. I just tried nvim, and this was a few weeks ago, so there may be better support now. My experience was that it was full, honest-to-god Copilot. The CodeLlama models are known to be quite good for their size, and the FIM part is great: boilerplate works so much easier with the surrounding context. I'd like to see more models released that can work this way.
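For step 3, a minimal sketch of the "config and run" part, assuming the Docker image from the text-generation-inference README (the version tag and volume path here are illustrative):<p><pre><code> docker run --gpus all --shm-size 1g -p 8080:80 \
     -v $PWD/tgi-data:/data \
     ghcr.io/huggingface/text-generation-inference:latest \
     --model-id codellama/CodeLlama-13b-hf
</code></pre>
Then point llm-ls at http://localhost:8080; the exact config key depends on your editor plugin, so check its README.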
FWIW: you can use any other reverse proxy for this, pointed at any OpenAI-compatible API server.<p>e.g. with mitmproxy and the llama-cpp-python server<p><pre><code> python -m llama_cpp.server --n_ctx 4096 --n_gpu_layers 1 --model ./path/to/..gguf
</code></pre>
and then with mitmproxy in another terminal<p><pre><code> mitmproxy -p 5001 --mode reverse:http://127.0.0.1:8000
</code></pre>
and then set this in your vscode settings.json (the same as for localpilot):<p><pre><code> "github.copilot.advanced": {
"debug.testOverrideProxyUrl": "http://localhost:5001",
"debug.overrideProxyUrl": "http://localhost:5001"
}
</code></pre>
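To sanity-check the chain before pointing Copilot at it, you can hit the proxy directly; llama-cpp-python exposes an OpenAI-compatible completions endpoint (the prompt and token count are just an example):<p><pre><code> curl http://localhost:5001/v1/completions \
     -H "Content-Type: application/json" \
     -d '{"prompt": "def fib(n):", "max_tokens": 64}'
</code></pre>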
works way better for me than localpilot
I guess similar to ollama (recently discussed: <a href="https://news.ycombinator.com/item?id=36802582">https://news.ycombinator.com/item?id=36802582</a>) which also has support for code-focused models (see: <a href="https://ollama.ai/library">https://ollama.ai/library</a>).<p>I tried pretty much all of them with Continue in VSCode, and it's a bit hit and miss, but the main difference is the way the workflows work (Copilot is mostly line completion, Continue is mostly chat or patches). So the main value add here for me would be a more Copilot-like workflow (which seems to align better with the day-to-day experience I've had so far).
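For reference, the ollama version of this is about as minimal as it gets (the model tag is illustrative; pick whichever code model the library page lists):<p><pre><code> ollama pull codellama:7b-code
 ollama run codellama:7b-code
</code></pre>
Continue then talks to ollama's local API on port 11434.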
I find the fact that Copilot is closed source -- not just the model, but even the plugins are closed source -- very worrying. Good to see efforts on the alternatives.
Looks cool! Always like to see these local alternatives. I'm a Sublime Text user (it is still amazing!) so there aren't many options for LLM assistants. The only one I found that works for me on Sublime is <a href="https://codeium.com/" rel="nofollow noreferrer">https://codeium.com/</a> and it is also free for the basic usage.<p>They have a great list of supported editors:<p>- Android Studio
- Chrome (Colab, Jupyter, Databricks and Deepnote, JSFiddle, Codepen, Codeshare, and StackBlitz)
- CLion
- Databricks
- Deepnote
- Eclipse
- Emacs
- GoLand
- Google Colab
- IntelliJ
- JetBrains
- Jupyter Notebook
- Neovim
- PhpStorm
- PyCharm
- Sublime Text
- Vim
- Visual Studio
- Visual Studio Code
- WebStorm
- Xcode<p>I have found that the completions are decent enough. I do find that sometimes the completion suggestions are too aggressive and try to complete more than I want, so I end up leaving it off until I feel like I could use it.
Hm… the q4 34B CodeLlama (which is used here) performs quite poorly in my experience.<p>Using a heavily quantised larger model gives you the unrealistic impression that smaller and larger models are roughly equally capable… but it's a trade-off. The larger CodeLlama model is <i>categorically</i> better, if you don't lobotomise it.<p>It'd be better if, instead of making opinionated choices (which aren't great), it guided you on how to select an appropriate model…
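If you want to see the trade-off for yourself, run the same model at two quant levels with the llama-cpp-python server from upthread (the file names are illustrative; use whatever GGUF quants you have):<p><pre><code> # heavily quantised: smallest and fastest, noticeably degraded on code
 python -m llama_cpp.server --n_ctx 4096 --model ./codellama-34b.Q4_K_M.gguf

 # higher-precision quant: roughly double the memory, much closer to full precision
 python -m llama_cpp.server --n_ctx 4096 --model ./codellama-34b.Q8_0.gguf
</code></pre>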
Why does it seem like a lot of the local AI tools target Macs specifically? Why don't AI developers seem to be able to write cross-platform software?
I would love to be able to take a base model and fine-tune it on a handful of hand-picked repositories that are A) in a specific language I want to use and B) stylistically similar to how I want to write code.<p>I'm not sure how possible that is to do, but I hope we can get there at some point.
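The data-prep half is at least straightforward; something like this (the repo URL and file glob are placeholders) gets you a corpus to feed whatever fine-tuning framework you end up using:<p><pre><code> # shallow-clone the hand-picked repos and concatenate their source files
 git clone --depth 1 https://github.com/example/repo
 find repo -name '*.py' -exec cat {} + > corpus.txt
</code></pre>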