Short summary of the paper:

Take Gemma-2B. Take your API. Use GPT-3.5 to generate 1,000 "correct" API function-call responses by placing only your API's functions in the pre-prompt, then prompting it. I imagine they use GPT-3.5 to create the request language as well. Then generate 1,000 "incorrect" API-call responses by filling the pre-prompt with functions not from your API.

Finetune.

Note that they use "functional tokens" in training: they map each function to a particular, previously unused token and refer to it that way. They claim this speeds up inference (I'm sure it does). They make no claims about whether it changes their accuracy (I bet it does). It definitely makes the system more fragile and harder to train for large and very large APIs.

Outcome: a highly capable *single-API* function-calling LLM. They say you could do it with as few as 100 training inputs if you really wanted.

I think this is interesting, but not world-shattering. I could imagine building a nice little service company on it: basically "send us a git repo and you'll get a helpful function-call API for this version of your code, which you can hook up to an API endpoint / chatbot".

Limitations will come largely from Gemma-2B's skills -- a 2B model isn't super sophisticated, and you can see they specify "<30 tokens" for the prompt. But I imagine this could be trained quickly enough to be part of a release CI process. There are a number of libraries I use that I would like to have access to such a model for.

I'd be interested in something that has general knowledge of a large set of packages for a language, and could pull in / finetune / MoE little models for specific repositories I'm coding on. Right now I would rely on either a very large model and hope its knowledge cutoff is right (Claude/GPT-4), or on using a lot of a large context window.
There might be some Goldilocks version in the middle here that would be helpful in a larger codebase but faster and more accurate than the cloud monopoly providers.
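To make the recipe concrete, here's a minimal sketch of the data-generation step as I understand it. Everything here is my own invention for illustration -- the function names, the prompt wording, and `call_llm`, which is a local placeholder standing in for a real GPT-3.5 API call -- not code from the paper.

```python
import random

# Your API (in-scope functions) vs. functions NOT in your API, used for negatives.
MY_API = {
    "get_user": "get_user(user_id: int) -> dict",
    "list_orders": "list_orders(user_id: int, limit: int = 10) -> list",
}
OTHER_API = {
    "send_email": "send_email(to: str, body: str) -> bool",
}

def call_llm(prompt: str) -> str:
    """Placeholder for a GPT-3.5 call; returns a canned response here."""
    return '{"function": "get_user", "arguments": {"user_id": 42}}'

def make_example(api: dict, label: str) -> dict:
    # The pre-prompt lists only the functions the model is allowed to call.
    pre_prompt = "You may only call these functions:\n" + "\n".join(api.values())
    request = call_llm(pre_prompt + "\n\nWrite a user request answerable with one call.")
    response = call_llm(pre_prompt + "\n\nRequest: " + request + "\nEmit the call as JSON.")
    return {"prompt": request, "completion": response, "label": label}

# 1,000 "correct" examples from your API, 1,000 "incorrect" from foreign functions.
dataset = (
    [make_example(MY_API, "correct") for _ in range(1000)]
    + [make_example(OTHER_API, "incorrect") for _ in range(1000)]
)
random.shuffle(dataset)
print(len(dataset))  # 2000 examples ready for finetuning
```

In a real pipeline you'd swap `call_llm` for an actual chat-completion request and feed `dataset` to your finetuning job.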
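And a sketch of the functional-token trick: each API function gets a dedicated, previously unused token, and the model is trained to emit that token instead of spelling out the function name. The token names (`<fn_0>`, `<fn_1>`, ...) and helpers below are mine, not the paper's; in real training the fresh tokens would be added to the tokenizer vocab and the embedding matrix resized to match.

```python
FUNCTIONS = ["get_user", "list_orders", "cancel_order"]

# Reserve one fresh token per function.
TOKEN_FOR = {name: f"<fn_{i}>" for i, name in enumerate(FUNCTIONS)}
FUNCTION_FOR = {tok: name for name, tok in TOKEN_FOR.items()}

def encode_target(function_name: str, args: str) -> str:
    """Training target: the functional token plus the argument string."""
    return f"{TOKEN_FOR[function_name]}({args})"

def decode_output(model_output: str) -> tuple:
    """Map a generated functional token back to a real function call."""
    tok, _, rest = model_output.partition("(")
    return (FUNCTION_FOR[tok], rest.rstrip(")"))

target = encode_target("list_orders", "user_id=7, limit=5")
print(target)                 # <fn_1>(user_id=7, limit=5)
print(decode_output(target))  # ('list_orders', 'user_id=7, limit=5')
```

You can see why this speeds up decoding (one token instead of several) and also why it gets fragile for large APIs: every function needs its own reserved token with a trained embedding.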