I'm sure this, and other LLM/IDE integrations, have their uses, but I'm failing to see how it's really any kind of major productivity boost for normal coding.<p>I believe the average stats for programmer productivity on production-quality, debugged, and maybe reusable code are pretty low - around 100 LOC/day - although it's easy to hit 1000 LOC/day or more when building throwaway prototypes etc.<p>The gap between productivity on production-quality code and on hacking/prototyping comes down to the quality aspect, and for most competent/decent programmers, coding something themselves is going to produce better-quality code, which they understand, than copying something from Substack or an LLM. The amount of time it'd take to analyze the copied code for correctness, lack of vulnerabilities, or even just decent design for future maintainability (a much bigger factor in total lifetime software cost than writing the code in the first place) would seem to swamp any time gained by not having to write the code yourself - which is basically the easiest and least time-consuming part of any non-trivial software project.<p>I can see the use of LLMs in some learning scenarios, or when writing throwaway code where quality is unimportant, but for production code I think we're still a long way from the point where an LLM's output is developer-level and doesn't need to be scrutinized/corrected to such a degree that the speed benefit of using it is completely lost!
Just what I've been looking for!<p>Thanks for pushing the tooling of self-hosted LLMs forward, Justine. Llamafiles specifically should become a standard.<p>Would there be a way of connecting to a remote LLM that's hosted on the same LAN, but not on the same machine? I don't use Apple devices, but do have a capable machine on my network for this purpose. This would also allow working from less powerful devices.<p>Maybe the Llamafile could expose an API? This steps into LSP territory, and while there is such a project[1], leveraging Llamafiles would be great.<p>[1]: <a href="https://github.com/huggingface/llm-ls">https://github.com/huggingface/llm-ls</a>
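Since a llamafile can already run as a small web server (the llama.cpp server on localhost:8080), a thin Emacs-side client pointed at another machine might be all the glue that's needed. A rough, untested sketch; the LAN address, the /completion endpoint, and the field names are llama.cpp-style assumptions to adjust for your setup:<p><pre><code> (require 'url)
 (require 'json)

 (defvar my/llamafile-url "http://192.168.1.50:8080/completion"
   "A llamafile running in server mode on another machine on the LAN.")

 (defun my/llamafile-complete (prompt)
   "POST PROMPT to the remote llamafile server and return the completion text."
   (let ((url-request-method "POST")
         (url-request-extra-headers '(("Content-Type" . "application/json")))
         (url-request-data
          (encode-coding-string
           (json-encode `(("prompt" . ,prompt) ("n_predict" . 128)))
           'utf-8)))
     (with-current-buffer (url-retrieve-synchronously my/llamafile-url)
       (goto-char (point-min))
       (re-search-forward "\n\r?\n")          ;; skip the HTTP response headers
       (alist-get 'content (json-read)))))
</code></pre><p>From there, pointing a completion command at the remote box instead of a local subprocess should be straightforward, and it would indeed let you work from less powerful devices.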
I'm running a MacBook Pro M1 Max with 64GB RAM. I downloaded the 34B Q5 model (the large one) and can confirm it works nicely. It's slow, but usable. Note that I am running it on my Asahi Fedora Linux partition, so I do not know if or how it is utilizing the GPU. (Asahi has OpenGL support but not Metal.)<p>My environment is configured with zsh 5.9. If I invoke the LLM directly as root (via sudo), it loads up quickly into a web server and I can interact with it via a web browser pointed at localhost:8080.<p>However, when I try to run the LLM from Emacs (after loading the Lisp script via M-x eval-buffer), I get a "Doing vfork: Exec format error". This happens when trying to follow the demo in the README by typing C-c C-k after typing the beginning of the is_prime function.<p>Any ideas as to what's going wrong?
Unrelated to the plugin, but wow, the is_prime function in the video demonstration is awful. Even if the input is not divisible by 2, it'll still check it modulo 4, 6, 8, ..., which is completely useless. It could be made literally 2x faster by adding a single line of code (a parity check) and then making the loop go over odd numbers only. I hope you people using these LLMs are reviewing the code you get before pushing to prod.
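For concreteness, the shape of the fix being described, sketched here in Emacs Lisp to match the rest of the thread (the demo itself isn't in Lisp):<p><pre><code> (defun my/prime-p (n)
   "Return non-nil if N is a prime number."
   (cond ((< n 2) nil)
         ((= n 2) t)
         ((zerop (% n 2)) nil)          ; the single-line parity check
         (t (let ((i 3) (prime t))
              (while (and prime (<= (* i i) n))
                (when (zerop (% n i))
                  (setq prime nil))
                (setq i (+ i 2)))       ; try odd divisors only
              prime))))
</code></pre><p>(Stopping at sqrt(n), as above, is a further win on top of the 2x you get from the parity check and odd-only loop.)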
This is great for what it does, but I want a more generic LLM integration that can do this and everything else LLMs do.<p>For example, one keystroke could be "complete this code", but other keystrokes could be:<p>- send the current buffer to the LLM as-is<p>- send the region to the LLM<p>- send the region to the LLM and replace it with the result<p>I guess there are a few orthogonal features here: getting input into the LLM in various ways (region, buffer, file, inline prompt), and then outputting the result in various ways (append at point, overwrite region, put it in a new buffer, etc.). On top of that you can then build various automatic system prompts, like code completion, prose, etc.
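The "send region, replace with result" piece, for instance, is only a few lines if you pipe the region through whatever command wraps your model. A rough, untested sketch, where my/llm-command is just a placeholder for however you invoke your local LLM (a llamafile, a llama.cpp CLI, whatever):<p><pre><code> (defvar my/llm-command "my-llm --stdin"
   "Placeholder: a shell command that reads a prompt on stdin and prints a reply.")

 (defun my/llm-replace-region (beg end)
   "Send the region to `my/llm-command' and replace it with the output."
   (interactive "r")
   ;; The REPLACE argument of `shell-command-on-region' swaps the region
   ;; for the command's output in place.
   (shell-command-on-region beg end my/llm-command nil t))
</code></pre><p>The other combinations (whole buffer in, new buffer out, append at point, ...) are variations on the same couple of primitives.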
Super interesting and I will try it out for sure!<p>But: the mode of operation is quite different from how GitHub Copilot works, so maybe the name is not very well chosen.<p>It's somewhat surprising that there isn't more development happening around integrating large language models with Emacs. Given its architecture, Emacs appears to be an ideal platform for such integration, yet most projects haven't been touched in months. But maybe the crowd that uses Emacs is largely also the crowd that would be against using LLMs?
For vim, I use a custom command which takes the currently selected code and opens a browser window like this:<p><a href="https://www.gnod.com/search/ai#q=Can%20this%20Python%20function%20be%20improved%3F%0A%0Adef%20sum_of_squares(n)%3A%0A%20%20%20%20result%20%3D%200%0A%20%20%20%20for%20i%20in%20range(1%2C%20n%2B1)%3A%0A%20%20%20%20%20%20%20%20result%20%2B%3D%20i**2%0A%20%20%20%20return%20result" rel="nofollow">https://www.gnod.com/search/ai#q=Can%20this%20Python%20funct...</a><p>So I can comfortably ask different AI engines to improve it.<p>The command I use in my vimrc:<p><pre><code> command! -range AskAI '<,'>y|call system('chromium gnod.com/search/ai#q='.substitute(iconv(@*, 'latin1', 'utf-8'),'[^A-Za-z0-9_.~-]','\="%".printf("%02X",char2nr(submatch(0)))','g'))
</code></pre>
So my workflow when I have a question about some part of my code is to highlight it and hit the : key, which puts :'<,'> on the command line, then I type AskAI<enter>.<p>The whole thing takes about a second, as it's already in my muscle memory.
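For the Emacs users reading along, a rough equivalent of the same trick (URL-encode the region and hand it to the browser) is only a few lines of Lisp. Untested sketch, with the gnod.com URL kept from the parent:<p><pre><code> (require 'url-util)

 (defun my/ask-ai-region (beg end)
   "Open the region between BEG and END as a query on gnod.com's AI search."
   (interactive "r")
   (browse-url
    (concat "https://www.gnod.com/search/ai#q="
            (url-hexify-string (buffer-substring-no-properties beg end)))))
</code></pre><p>Bind it to a key, select a region, and you get the same "ask several engines at once" page.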
This is quite intriguing, mostly because of the author.<p>I don't understand very well how llamafiles work, so it looks a little suspicious to just call it every time you want a completion (model loading, etc.), but I'm sure this is somehow covered within the llamafile's design. I wonder about the latency, and whether it would be much impacted if a network call were introduced so that you could use a model hosted elsewhere. Say a team uses a bunch of models for development, shares them in a private cluster, and uses them for code completion without having to leak any code to OpenAI etc.
Does anyone else get "Doing vfork: Exec format error"?
Final-generation Intel Mac, 32 GB memory. I can run the llamafile from a shell. Tried both wizardcoder-python-13b and phi.
I use Emacs for most of my work related to coding and technical writing.
I've been running phind-v2-codellama and openhermes using ollama and gptel, as well as GitHub's Copilot. I like how you can send an arbitrary region to an LLM and ask things about it. Of course the UX is at an early stage, but just imagine if a foundation model could take all the context (e.g. your Org-mode files and open file buffers) and use tools like LSP.
> You need a computer like a Mac Studio M2 Ultra in order to use it. If you have a mere Macbook Pro, then try the Q3 version.<p>The intersection between people who use Emacs for coding and those who own a Mac Studio Ultra must be minuscule.<p>Intel MKL plus some minor tweaking gets you really excellent LLM performance on a standard PC, and that's without using the GPU.
What is the upgrade path for a Llamafile? Based on my quick reading and fuzzy understanding, it smushes llama.cpp (smallish, updated frequently) and the model weights (large, updated infrequently) into a single thing. Is it expected that I will need to re-download multiple gigabytes of unchanged models when there's a fix to llama.cpp that I wish to have?
Also worth checking out for more general use of LLMs in emacs: <a href="https://github.com/karthink/gptel">https://github.com/karthink/gptel</a>
How does one get the recommended WizardCoder-Python-13b llamafile? Searching turns up many results from many websites. Further, it appears that a llamafile is a specific format that somehow bundles the model weights together with the code used to interface with them.<p>Is it the one listed here? <a href="https://github.com/Mozilla-Ocho/llamafile">https://github.com/Mozilla-Ocho/llamafile</a>
<p><pre><code> ;;; copilot.el --- Emacs Copilot
;; The `copilot-complete' function demonstrates that ~100 lines of LISP
;; is all it takes for Emacs to do that thing Github Copilot and VSCode
;; are famous for doing except superior w.r.t. both quality and freedom
</code></pre>
> ~100 lines<p>I wonder if emacs-copilot could extend itself, or even bootstrap itself from fewer lines of code.
How well does Copilot work for refactoring?<p>Say I have a large Python function and I want to move a part of it into a new function. Can Copilot do that, and make sure that all the local variables referenced from the outer function are passed in as parameters, and all the changed variables are passed back, e.g. via return values?
Excellent work, thanks!<p>Have you perhaps thought about the possibility of an extension that could allow an Emacs user to collect data, to be used on a different machine/cluster for human fine-tuning?
It's going to be like self-driving cars all over again.<p>Tech people said it would never happen, because even if the car is 10x safer than a normal driver, people will never trust it unless it's almost perfect. But once self-driving cars were good enough to stay in a lane and maybe even brake at the right time, people were happy to let them take over.<p>Remember how well sandboxed we thought we'd keep anything even close to a real AI, just in case it decided to take over the world? Now we're letting it drive Emacs. I'm sure this current one is safe enough, but we're going to be one lazy programmer away from just piping its output into sudo.
This has some really nice features that would be awesome to have in GitHub Copilot, namely streaming tokens, customizing the system prompt, and pointing at a local LLM.
Note that this isn't for GitHub's Copilot, but rather for running your own LLM engine locally. It's going to get confused with the unofficial copilot-for-emacs plugin pretty quickly: <a href="https://github.com/zerolfx/copilot.el">https://github.com/zerolfx/copilot.el</a>