Ok, since this runs fully privately, how can I add my own private data? For example, I have 20+ years of email archive that I'd like to have ingested.
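I'm picturing something along the lines of a local embed-and-retrieve pipeline, roughly like the sketch below (assuming llama-cpp-python's embedding API and a plain mbox export; the model path and mbox filename are placeholders, and a real setup would chunk messages and use a proper vector store):

```python
# Rough sketch of a local "ingest my mail" pipeline (not part of llama-gpt itself).
# Assumes llama-cpp-python is installed and a local quantized model file exists.
import mailbox

import numpy as np
from llama_cpp import Llama

llm = Llama(model_path="./models/llama-2-7b-chat.bin", embedding=True)  # placeholder path

def embed(text: str) -> np.ndarray:
    # Truncate long mails so they fit the context window of the embedding pass.
    out = llm.create_embedding(text[:2000])
    return np.array(out["data"][0]["embedding"])

# Index: one vector per message (a real setup would chunk and persist these).
docs, vecs = [], []
for msg in mailbox.mbox("archive.mbox"):  # placeholder export of the archive
    body = msg.get_payload(decode=True)
    if not body:
        continue  # skip multipart containers
    text = body.decode("utf-8", errors="ignore")
    docs.append(text)
    vecs.append(embed(text))

def search(query: str, k: int = 3):
    # Cosine similarity between the query and every indexed message.
    q = embed(query)
    sims = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9)) for v in vecs]
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

# The retrieved mails would then be pasted into the chat prompt as context.
```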
Very cool, this looks like a combination of chatbot-ui and llama-cpp-python? A similar project I've been using is <a href="https://github.com/serge-chat/serge">https://github.com/serge-chat/serge</a>. Nous-Hermes-Llama2-13b is my daily driver and scores high on coding evaluations (<a href="https://huggingface.co/spaces/mike-ravkine/can-ai-code-results" rel="nofollow noreferrer">https://huggingface.co/spaces/mike-ravkine/can-ai-code-resul...</a>).
Nice project! I couldn't find this in the README.md: can I run this with a GPU? If so, what do I need to change? It seems like it's hardcoded to 0 in the run script: <a href="https://github.com/getumbrel/llama-gpt/blob/master/api/run.sh#L12">https://github.com/getumbrel/llama-gpt/blob/master/api/run.s...</a>
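For reference, with llama-cpp-python the relevant knob is usually n_gpu_layers. This is roughly the change I'd expect to need, as a sketch (assuming llama-cpp-python was compiled with cuBLAS or Metal support; the layer count and model path are made up):

```python
# Sketch: offloading layers to the GPU with llama-cpp-python.
# The server equivalent would be something like:
#   python3 -m llama_cpp.server --model <path> --n_gpu_layers 35
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b-chat.bin",  # placeholder path
    n_gpu_layers=35,  # 0 = CPU only; raise until you run out of VRAM
    n_ctx=4096,
)

out = llm("Q: Name three planets. A:", max_tokens=32)
print(out["choices"][0]["text"])
```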
I didn't see any info on how this is different from installing/running llama.cpp or koboldcpp. New offerings are awesome, of course, but what is it adding?
What is the advantage of this versus running something like <a href="https://github.com/simonw/llm">https://github.com/simonw/llm</a> , which also gives you options to e.g. use <a href="https://github.com/simonw/llm-mlc">https://github.com/simonw/llm-mlc</a> for accelerated inference?
So many projects still using GPT in their name.

Is the thinking here that OpenAI is not going to defend that trademark? Or just kicking the can down the road on rebranding until the C&D letter arrives?
(1) What are the best of the more creative / less lobotomized versions of Llama 2?
(2) What's the best way to get one of those running in a similarly easy way?
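For (2), ideally it would be about as simple as this rough sketch with llama-cpp-python (assuming you've already downloaded a quantized build of some Llama 2 fine-tune, e.g. the Nous-Hermes-Llama2-13b mentioned elsewhere in the thread; the file path is a placeholder):

```python
# Sketch: chatting with a locally downloaded Llama 2 fine-tune via llama-cpp-python.
from llama_cpp import Llama

llm = Llama(model_path="./models/nous-hermes-llama2-13b.q4_0.bin", n_ctx=4096)  # placeholder path

resp = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a limerick about local LLMs."},
    ],
    max_tokens=200,
)
print(resp["choices"][0]["message"]["content"])
```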