I mean in the sense of ML teams or departments creating custom models for their company, institution, etc. You can now do pretty much all that with a good LLM (fine-tuning it if needed), and it seems like ChatGPT Enterprise, due to its privacy and encryption features, is the final nail in the coffin (I imagine one of the strongest arguments against using ChatGPT was secrecy). And even if ChatGPT Enterprise isn't quite the final nail, LLMs are advancing so fast that you'll probably be able to run local models as good as GPT-4 or better in a year or two, which, I imagine, will really be the final nail. Or am I wrong about this?
Some languages will be hard to train for properly: e.g. Serbian has 7 grammatical cases (nominative, genitive, dative...), gender and number embedded in noun forms, and verbs that embed gender and number on top of tense. This explosion in complexity makes it very hard to find data covering all possible combinations.<p>Using ChatGPT in Serbian is like talking to a really bad... bot from the 90s. It mostly feels like two bad machine translation steps, to and from English, with the real work in between.<p>So there is a whole NLP world out there that's badly covered by OpenAI, which means lots of room for advances in ML.<p>Not that there aren't plenty of improvements to be made to ChatGPT's English models too.
<a href="https://en.m.wikipedia.org/wiki/Machine_learning" rel="nofollow noreferrer">https://en.m.wikipedia.org/wiki/Machine_learning</a><p>We’re keeping it a secret.
If you look at papers on arXiv you see there is<p>"zero shot" = ask an LLM to do it with just a prompt<p>"few shot" = show a model (maybe an LLM) a few examples; LLMs perform well with "in-context learning", which means giving a prompt AND showing some examples<p>"many shot" = train a model on many (typically thousands of) examples.<p>The more training examples you have, the better results you get. A lot of people are seduced by ChatGPT because it promises fast results without hard work, rigorous thinking, and such, but you get back what you put in.<p>My RSS reader and agent YOShInOn uses<p><a href="https://sbert.net/" rel="nofollow noreferrer">https://sbert.net/</a><p>to transform documents into vectors, and then I apply classical ML techniques such as the support vector machine, logistic regression, k-means clustering, and so on. I used to do the same things with a bag-of-words model; BERT-like models give a significant boost in accuracy, are simple to implement, and run quickly. I can write a script that tests thousands of alternative models in a day.<p>The main classification YOShInOn does is "will I like this content?", which is a rather fuzzy problem that won't ever test perfectly. I tried applying a fine-tuned model to it, and after a few days of trying different things I developed a procedure that took 30 minutes to make a model about as good as my classical ML model, which took more like 30 seconds to train. If my problem weren't so fuzzy I'd benefit more from fine-tuning, and someday I might use YOShInOn to build a training set for a better-defined problem, but I am delighted with the system I have now because it does things I've dreamed of for 20 years.<p>The whole "prompting" model is dangerously seductive for various reasons, but the low-down is that language is treacherous.
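To make the embed-then-classify pipeline concrete, here is a minimal sketch of the kind of thing described above: sentence vectors fed into scikit-learn's logistic regression. The model name and the "will I like this?" labels are illustrative assumptions, not YOShInOn's actual code; the embeddings here are random vectors with a planted signal so the sketch runs offline (in practice you would call a sentence-transformers model, as noted in the comments).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# In a real pipeline the vectors come from an SBERT model, e.g.:
#   from sentence_transformers import SentenceTransformer
#   emb = SentenceTransformer("all-MiniLM-L6-v2").encode(texts)
# Here: random 384-dim vectors (MiniLM's embedding size) with a
# planted signal in feature 0, standing in for document embeddings.
rng = np.random.default_rng(0)
n_docs, dim = 200, 384
emb = rng.normal(size=(n_docs, dim))
labels = (emb[:, 0] > 0).astype(int)  # stand-in for "liked it" labels

# Classical ML on top of the embeddings: cheap to train, easy to
# cross-validate, so you can test many model variants per day.
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, emb, labels, cv=5)
print(scores.mean())
```

Swapping in an SVM or k-means is a one-line change, which is what makes it practical to sweep thousands of model/hyperparameter combinations quickly.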
This classic book<p><a href="https://www.amazon.com/G%C3%B6del-Escher-Bach-Eternal-Golden/dp/0465026567/ref=sr_1_1?hvadid=580741325414&hvdev=c&hvlocphy=9005411&hvnetw=g&hvqmt=e&hvrand=517620624109363776&hvtargid=kwd-131412542&hydadcr=3176_13534135&keywords=godel+escher+bach&qid=1693362858&sr=8-1" rel="nofollow noreferrer">https://www.amazon.com/G%C3%B6del-Escher-Bach-Eternal-Golden...</a><p>is not an easy read, but it contains some parables that explain why making a chatbot do what people would like a chatbot to do will be like endlessly pushing a bubble under the rug, and these problems are not about the technology behind the chatbot but about the problem it is trying to solve.
LLMs are still limited compared to highly customized models designed for specific use cases and datasets. There is still value in having skilled ML researchers and engineers who can build specialized models optimized for particular business needs. LLMs can be a powerful tool to enable faster iteration, but may not be a full replacement for custom modeling in many production environments. Factors like model accuracy, latency, explainability, and regulatory requirements may necessitate customized modeling.