I have this idea to use an LLM daily: train it on my emails, notes, and chats, have it draft replies that I edit as needed, and have it learn from those edits.<p>Is anyone doing anything like that? I have all of the open source stuff downloaded (models, lollms-webui, promptfoo, etc.) and have been experimenting with the interactive chat tools, plus txtai for semantic search.<p>That all seems pretty mature and progressing nicely; I expect a clear reference stack to emerge within a few months.<p>What about the assistant stack? I'm investing all these resources to self-host and feed in all my data, and I want to maximize the ROI.
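For the txtai piece, here's a minimal sketch of indexing a folder of plaintext notes for semantic search (the folder path and embedding model are illustrative choices, not prescriptions):

```python
# Minimal semantic search over a folder of plaintext notes with txtai.
from pathlib import Path
from txtai.embeddings import Embeddings

# (id, text, tags) tuples; paths here are an assumed layout
notes = [(str(p), p.read_text(), None) for p in Path("notes").glob("**/*.md")]

# content=True stores the text so search results include it
embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2",
                         "content": True})
embeddings.index(notes)

# With content=True, search returns [{"id": ..., "text": ..., "score": ...}]
for hit in embeddings.search("what did I decide about self-hosting?", 3):
    print(hit["id"], round(hit["score"], 3))
```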
The most limiting factor I’ve come across is hitting the context window. Eventually your eager new employee starts to forget what you’ve taught them, but they’re too confident to admit it.
I'm pretty interested in this as well. I have moved from Notion to Obsidian for my personal notes, to-do lists, and miscellanea in preparation for this, since Obsidian uses local plaintext files.<p>What I would <i>love</i> to get working at some point is giving an LLM access to my schedule, notes, and goals and then having it prompt me at appropriate times. "Hey TJ, I noticed you haven't worked out this week; it's sunny today, so this might be a good time." That sort of thing.<p>There seems to be good tooling around agents, prompt engineering, RAG, etc. The glue for getting the LLM to figure out the appropriate time(s) to check in with me is the bit I'm missing, but that's probably mostly down to me being an artist and only a very junior hobbyist programmer.
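A rough sketch of that missing glue, assuming an Obsidian vault of markdown files and some model behind a hypothetical ask_llm() callable, run on a schedule via cron or launchd:

```python
# Hypothetical "check-in" glue: on a schedule, gather recent context from an
# Obsidian vault and ask a model whether a nudge is warranted right now.
# VAULT and ask_llm() are stand-ins, not a real integration.
import datetime
from pathlib import Path

VAULT = Path.home() / "Obsidian" / "vault"  # assumed vault location

def gather_context(days: int = 7) -> str:
    cutoff = datetime.datetime.now() - datetime.timedelta(days=days)
    recent = [p for p in VAULT.rglob("*.md")
              if datetime.datetime.fromtimestamp(p.stat().st_mtime) > cutoff]
    return "\n\n".join(p.read_text()[:1000] for p in recent)

def maybe_nudge(ask_llm) -> str | None:
    prompt = (
        "Here are my notes from the last week:\n"
        f"{gather_context()}\n\n"
        "Based on these, is there one gentle, timely reminder worth sending "
        "right now? Reply NONE if not, otherwise a one-sentence nudge."
    )
    reply = ask_llm(prompt)
    return None if reply.strip() == "NONE" else reply
```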
Training a local LLM on individual facts is tricky. Typically you can’t train on a limited quantity of data and expect the model to generalize well from it. In-context learning generalizes well, but it’s a bad fit for an “employee” model that’s supposed to learn over a long stretch of time.<p>If your goal is to bake new concepts into the model weights, your only real option is a dataset with that concept used in a wide variety of contexts.<p>A more feasible approach, I think, would be retrieval-augmented generation. You’d essentially store your conversations in a database and calculate embeddings as you go. This would let you later do a natural-language search of the database and insert the most relevant portion of the conversation into your context window.
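A minimal sketch of that retrieval idea, with an in-memory list standing in for the database and sentence-transformers for the embeddings (both illustrative choices):

```python
# Embed each conversation turn as it happens, then pull the closest turns
# back into the prompt later via cosine similarity on normalized vectors.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
store: list[tuple[str, np.ndarray]] = []  # (turn text, embedding) "database"

def remember(turn: str) -> None:
    store.append((turn, model.encode(turn, normalize_embeddings=True)))

def recall(query: str, k: int = 3) -> list[str]:
    q = model.encode(query, normalize_embeddings=True)
    scored = sorted(store, key=lambda t: float(np.dot(t[1], q)), reverse=True)
    return [text for text, _ in scored[:k]]

remember("We agreed to ship the invoicing feature by Friday.")
context = "\n".join(recall("what's the invoicing deadline?"))
```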
I would like to extend the question: is anyone building a homelab for the specific purpose of training an LLM on their personal info? The choice of hardware (for speed, cost, and noise concerns) seems important.
I was thinking of doing something along the same lines, but with code:<p>An LLM that is specifically trained for software development, to which I feed the code of all my company's repositories, and which I keep feeding commits/pull requests.<p>The idea is that I can query it about architectural issues, code improvements, and other technical aspects at different levels of abstraction (code, architecture, business, etc.).<p>So far, I've played a bit with CodeRabbit and it's "just OK" — more a small window onto what could be than actually useful.
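A hedged sketch of the ingestion half: indexing commit messages for retrieval, so relevant history can be pulled into the prompt when asking architectural questions (repo path and model are illustrative; source files could be chunked and indexed the same way):

```python
# Turn git commit history into retrievable documents with txtai.
import subprocess
from txtai.embeddings import Embeddings

# %x00 / %x01 are NUL / SOH delimiters between fields and commits
log = subprocess.run(
    ["git", "-C", "myrepo", "log", "--pretty=format:%H%x00%s%n%b%x01"],
    capture_output=True, text=True, check=True,
).stdout

docs = []
for entry in log.split("\x01"):
    if "\x00" in entry:
        sha, message = entry.strip().split("\x00", 1)
        docs.append((sha, message, None))

embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2",
                         "content": True})
embeddings.index(docs)
hits = embeddings.search("why did we split the billing service?", 5)
```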
As a solo developer answering emails that basically point people to various guides and FAQs I’ve published … I need this. Zendesk claims to have an AI component but forces you to input all training data into their own wiki knowledge base. I can see why they don’t want to use prior responses as training data (PII concerns), but at least give me some boilerplate responses I can use as a head start and to further train the model(s).
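A sketch of that head-start drafter, where search_faqs() and complete() are stand-ins for whatever retrieval index and model you end up with:

```python
# Retrieve the closest published FAQ sections, then have a model draft a
# reply for human editing. Both callables are hypothetical placeholders.
def draft_reply(email_body: str, search_faqs, complete) -> str:
    sections = search_faqs(email_body, k=3)
    prompt = (
        "You answer support email for a solo developer. Using ONLY the FAQ "
        "excerpts below, draft a short, friendly reply that links the "
        "relevant guide. Say so if the FAQs don't cover the question.\n\n"
        + "\n---\n".join(sections)
        + f"\n\nCustomer email:\n{email_body}\n\nDraft reply:"
    )
    return complete(prompt)
```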
I've been building workflow assistants that make existing employees more productive or enable entirely new business models. Some of these assistants use selected local models (due to cost or privacy factors).<p>Currently the stack gravitates around:<p>- GPT-4, either to drive the entire workflow OR to generate prompts, plans, and guidelines for the local models to execute.<p>- Structured knowledge bases (either derived from existing sources OR curated manually by companies to drive AI assistants).<p>- Embedding search indexes, augmented by full-text search. Usually the LLM has access to the search engine and can drive the search as needed, refining queries if the results aren't good enough (see the sketch below).<p>All of this is instrumented with logic to capture user feedback at every single step, which is crucial for continuous improvement of the model.<p>The bigger model can use this feedback periodically to improve the plans and workflow guidelines, making the overall process more efficient.<p>AMA, if needed!
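A rough sketch of that "LLM drives the search" loop, with search() and llm() as placeholders for the hybrid index and model:

```python
# The model sees results, judges them, and may rewrite the query.
# search() should combine embedding and full-text retrieval; llm() is any
# completion callable. Both are assumptions for illustration.
def search_loop(question: str, search, llm, max_rounds: int = 3) -> list[str]:
    query = question
    for _ in range(max_rounds):
        results = search(query, k=5)
        verdict = llm(
            "Question: " + question + "\n"
            "Results:\n" + "\n".join(results) + "\n"
            "If these answer the question, reply GOOD. Otherwise reply with "
            "a single improved search query."
        ).strip()
        if verdict == "GOOD":
            return results
        query = verdict  # refined query, try again
    return results
```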
Everyone I know just uses the hosted ones, because of the sheer performance gap.<p>For now, you can do all the custom/manual training you want, but GPT-4 will almost always outperform it given the right context.<p>Hopefully that will change in the future. Even then, I don't expect people to want to self-host as in on their own machines; more like custom training, then hosting on SaaS, PaaS, or their own on-prem if they have it. Dedicating a personal laptop's compute to an LLM isn't worth the performance hit on everything else. Again, maybe that will change.
Cool use case, glad to see txtai [1] is helping (I'm the main dev for txtai).<p>Since you're using txtai, this article I just wrote yesterday might be helpful: <a href="https://neuml.hashnode.dev/build-rag-pipelines-with-txtai" rel="nofollow noreferrer">https://neuml.hashnode.dev/build-rag-pipelines-with-txtai</a><p>Looks like you've received a lot of great ideas here already though!<p>1 - <a href="https://github.com/neuml/txtai">https://github.com/neuml/txtai</a>
Local models have taken a mind-boggling leap over the past few months, so I'm sure we'll soon be able to add our own layers ourselves, even on a laptop?<p>Seriously, this is not far from ChatGPT 3.5 at only 6.7 GB, and it runs on a MacBook Air:<p><a href="https://huggingface.co/TheBloke/Mistral-7B-OpenOrca-GGUF" rel="nofollow noreferrer">https://huggingface.co/TheBloke/Mistral-7B-OpenOrca-GGUF</a><p>But yeah, current context windows are limiting.
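A minimal llama-cpp-python sketch for running a GGUF file like that one locally (the filename and settings are illustrative; the quantization you pick changes the size):

```python
# Load a local GGUF quantization of Mistral-7B-OpenOrca and complete a
# prompt. The model card specifies ChatML-style prompting.
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-openorca.Q4_K_M.gguf",  # assumed local filename
    n_ctx=4096,  # the context-window ceiling discussed in this thread
)

prompt = ("<|im_start|>user\nSummarize today's notes in three bullets."
          "<|im_end|>\n<|im_start|>assistant\n")
out = llm(prompt, max_tokens=256, stop=["<|im_end|>"])
print(out["choices"][0]["text"])
```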
I've been using & contributing to Lightrail (<a href="https://github.com/lightrail-ai/lightrail">https://github.com/lightrail-ai/lightrail</a>). Each instance comes with a local vector DB and integrates with apps like Chrome & VSCode, so I can read in content like my notes, emails, etc. Unfortunately, it doesn't support self-hosted LLMs yet!
There's Rewind.ai for macOS, which captures all audio, video, and text it can see as you work, then lets you query it via its local LLM chat. It works pretty well, can also summarize meetings, and integrates with your calendar in certain ways. Note that it only uses documents you have viewed on-screen; it doesn't index your file directories.
I have a similar goal/desire as you. My current project is ingesting this type of data into Elasticsearch with vector embeddings, then using standard search plus kNN to pull context into the prompt.<p>This works reasonably well with GPT-4, but so far the resulting context is almost always too large for self-hosted models.
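A sketch of that retrieval step, assuming an Elasticsearch 8.x index with a dense_vector field named "embedding" (index name, field names, and model are illustrative):

```python
# kNN retrieval against Elasticsearch to assemble prompt context.
from elasticsearch import Elasticsearch
from sentence_transformers import SentenceTransformer

es = Elasticsearch("http://localhost:9200")  # assumed local cluster
model = SentenceTransformer("all-MiniLM-L6-v2")

def context_for(question: str, k: int = 5) -> list[str]:
    vector = model.encode(question).tolist()
    resp = es.search(
        index="personal-data",
        knn={"field": "embedding", "query_vector": vector,
             "k": k, "num_candidates": 50},
        source=["text"],
    )
    return [hit["_source"]["text"] for hit in resp["hits"]["hits"]]
```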
This is something I imagine coming out of AutoGen or OpenAI Assistants in a few months. You really need multiple agents (as of now) most of the time. IMO multiple GPT-4 agents <i>are</i> smart enough to accomplish a lot; the issue is getting them set up and working together.
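For reference, a minimal two-agent AutoGen (pyautogen) sketch of that wiring, with the config values as placeholders:

```python
# Two cooperating agents: an assistant backed by GPT-4 and a user proxy
# that relays messages. API key and task are placeholders.
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"config_list": [{"model": "gpt-4", "api_key": "sk-..."}]}

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",     # fully automated for this demo
    code_execution_config=False,  # disable local code execution
)

user_proxy.initiate_chat(
    assistant,
    message="Plan the steps to index my notes for semantic search.",
)
```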