I would like to have something as similar as possible to ChatGPT, but deployed in a closed environment: feed it my specific data and query it afterwards. What would you recommend?
Keep in mind that the number of parameters on these models goes into the hundreds of billions, so start with a local datacenter and work from there. Facebook's LLaMA, which is supposed to be similar in performance at a much reduced size, has only 65 billion parameters and weighs around 220 GB (as far as I remember from an article). You will need a beefy server to run it, and it had better be shared, or the server will sit idle most of the time.
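To put the size in perspective, a rough back-of-the-envelope sketch of the weight footprint at different precisions (this counts dense weights only and ignores activations, KV cache, and framework overhead, so treat it as a lower bound):

```python
# Approximate memory needed just to hold the weights of a 65B-parameter
# model at various numeric precisions. Real usage will be higher.
PARAMS = 65e9  # 65 billion parameters, as quoted for LLaMA's largest variant

def weight_gb(params: float, bytes_per_param: float) -> float:
    """Gigabytes of weight storage, ignoring everything but the parameters."""
    return params * bytes_per_param / 1e9

for label, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{label}: ~{weight_gb(PARAMS, bytes_per_param):.0f} GB")
```

At fp16 that is roughly 130 GB of weights alone, which is why quantized formats (int8/int4) are popular for running these models on a single machine.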
This might help you: <a href="https://www.hpc-ai.tech/blog/colossal-ai-chatgpt" rel="nofollow">https://www.hpc-ai.tech/blog/colossal-ai-chatgpt</a>
A web UI for running Large Language Models
<a href="https://github.com/oobabooga/text-generation-webui">https://github.com/oobabooga/text-generation-webui</a>