It seems to me that most startups use one of the LLMs from OpenAI (ChatGPT / GPT-4); other closed-source LLMs like Bard or Claude are not that popular, and open-source LLMs are only really used by researchers or hobbyists who run them locally.

Curious to hear from others who are building something on top of an LLM: which one do you use? Has anyone fine-tuned and deployed their own LLM, or do you just rely on ChatGPT?
I am experimenting with building software using the ReAct tool-prompting pattern, using Llama-derivative models like Manticore-13B, Airoboros, etc. I script it all together using Microsoft Guidance with Llama.cpp and AutoGPTQ. It works pretty well for simple tasks, and I know the costs are roughly fixed. Obviously their capabilities fall far short of OpenAI's products, but when you have tens of thousands of conversations to run, the costs of ChatGPT become a distraction. I haven't tried fine-tuning yet.
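For anyone unfamiliar with the pattern, the core of ReAct is just a generate/parse/execute loop. Here's a minimal hand-rolled sketch using llama-cpp-python rather than Guidance; the model path, the toy calculator tool, and the prompt template are all placeholders, not anything from the actual setup:

```python
# Minimal ReAct loop sketch using llama-cpp-python (not Guidance).
# Model path, calculator tool, and prompt template are placeholders.
import re
from llama_cpp import Llama

llm = Llama(model_path="./manticore-13b.q4_0.gguf")  # any local quantized model

def calculator(expr: str) -> str:
    """Toy tool: evaluate a basic arithmetic expression."""
    return str(eval(expr, {"__builtins__": {}}, {}))

PROMPT = """Answer the question using this format:
Thought: reason about what to do next
Action: calculator[<expression>]
Observation: <tool result>
... (repeat Thought/Action/Observation as needed)
Final Answer: <answer>

Question: {question}
"""

def react(question: str, max_steps: int = 5) -> str:
    transcript = PROMPT.format(question=question)
    for _ in range(max_steps):
        # Stop before the model hallucinates its own Observation.
        out = llm(transcript, max_tokens=256, stop=["Observation:"])
        text = out["choices"][0]["text"]
        transcript += text
        if "Final Answer:" in text:
            return text.split("Final Answer:")[-1].strip()
        match = re.search(r"Action: calculator\[(.+?)\]", text)
        if match:
            # Run the tool and feed the result back into the transcript.
            transcript += f"Observation: {calculator(match.group(1))}\n"
    return "no answer within step budget"

print(react("What is 17 * 23?"))
```

Guidance mostly buys you the same thing with templated generation and constrained outputs; the loop above is the unconstrained version of the idea.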
I recently started using ChatGPT via the OpenAI API. It seems to perform reasonably well on all the tasks I've needed it for so far, but bear in mind I've only used it for hobby projects.

I haven't tried the open-source LLMs yet, as there's the additional hassle of renting a server and deploying to it. Running them locally on a consumer GPU doesn't cut it yet; it's too slow. So to iterate faster, I prefer just using ChatGPT.
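For reference, the API call itself is only a few lines. A minimal sketch using the openai Python package (the pre-1.0 interface), with the prompts as placeholders:

```python
# Minimal ChatGPT call via the openai Python package (pre-1.0 interface).
# Requires OPENAI_API_KEY in the environment; the prompts are placeholders.
import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the ReAct prompting pattern in two sentences."},
    ],
    temperature=0.2,
)
print(response["choices"][0]["message"]["content"])
```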
We are using both OpenAI and Google's Bard.
Each has its pros and cons, but it's still mostly a work in progress. The main challenge is how to 'tune' these LLMs with our own data as an additional layer that improves the overall performance (i.e., quality of answers).
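One common way to add that layer without fine-tuning is retrieval augmentation: embed your own documents, pull the most relevant ones at query time, and put them into the prompt. That's my reading of the "another layer" idea, not necessarily what this setup does. A minimal sketch, assuming the pre-1.0 openai package and an in-memory document list (a real system would use a vector database):

```python
# Retrieval-augmented "layer" sketch: embed our own docs, retrieve by
# cosine similarity, and prepend the best hit to the prompt.
# The documents and question are placeholders.
import numpy as np
import openai

documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday through Friday, 9am to 5pm CET.",
]

def embed(text: str) -> np.ndarray:
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return np.array(resp["data"][0]["embedding"])

doc_vectors = [embed(d) for d in documents]

def answer(question: str) -> str:
    q = embed(question)
    # Rank documents by cosine similarity to the question.
    scores = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
              for v in doc_vectors]
    context = documents[int(np.argmax(scores))]
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp["choices"][0]["message"]["content"]

print(answer("How long do I have to return an item?"))
```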
For a side project I'm using a fine-tuned vicuna-13b. I'm using it to generate search queries from natural language, and it outperforms all the other open-source models I've tried at deep intent recognition.
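To give an idea of what that kind of setup looks like, here's an inference sketch with Hugging Face transformers. The checkpoint name and prompt template are hypothetical stand-ins for the actual fine-tuned model:

```python
# Query-generation sketch with Hugging Face transformers. The checkpoint
# name and prompt template are hypothetical stand-ins for the actual
# fine-tuned vicuna-13b.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "your-org/vicuna-13b-query-gen"  # hypothetical fine-tuned model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.float16, device_map="auto"
)

prompt = (
    "Rewrite the request as a search query.\n"
    "Request: cheap flights from Berlin to Lisbon next weekend\n"
    "Query:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
query = tokenizer.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(query.strip())
```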