Text classifiers are an underrated application of LLMs

94 points by juunge over 1 year ago

16 comments

thewataccount over 1 year ago
From my experience, "single prompt classification" isn't as simple as "type in a sentence and it works" in practice. But you can use some methods to massively improve its consistency and output.

I cannot recommend guidance enough. You can use shockingly small Llama models for some tasks with guidance while only actually generating a handful of tokens.

You should strongly consider some form of guidance/logit bias for classification, especially if you have a known set of classes. This ensures you get output in the format you want, with the correct classes you want.

Keep in mind LLMs perform much better with CoT. So you make it explain what the text/image is, then explain the possible classifications, then list its final decision. Again, guidance can ensure it follows the correct format to do this.

LLMs still massively benefit from finetuning, especially if you want to classify in a particular format: notebook tags vs SFW/NSFW vs important subjects, etc. Existing alignment can sometimes interfere with some of these classifications too, which finetuning helps smooth out.
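A minimal sketch of what constraining classification to a known label set looks like. Here a toy keyword scorer stands in for the model's per-label logprobs (in a real setup, guidance or a logit bias would supply the constraint); the labels and scorer are illustrative, not from any real system:

```python
# Constrained classification: instead of free-form generation, score each
# allowed label and pick the best. The output is guaranteed to be a member
# of ALLOWED_LABELS -- no parsing, no hallucinated classes.

ALLOWED_LABELS = ["spam", "billing", "bug_report", "other"]

def score_label(text: str, label: str) -> float:
    # Stand-in for the summed token logprobs of `label` given `text`.
    keywords = {
        "spam": ["free", "winner", "click"],
        "billing": ["invoice", "charge", "refund"],
        "bug_report": ["crash", "error", "broken"],
    }
    return sum(text.lower().count(k) for k in keywords.get(label, []))

def classify_constrained(text: str) -> str:
    # Argmax over the fixed label set, never over the whole vocabulary.
    return max(ALLOWED_LABELS, key=lambda lbl: score_label(text, lbl))

print(classify_constrained("The app crashed with an error on login"))
```

The same shape works with real logprobs: ask the model to complete only the label slot and compare the allowed continuations.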
kcorbitt over 1 year ago
Yeah, totally agree. We've found that a ton of OpenAI usage in practice is a variant of either classification or information extraction. This makes sense -- going from a human-native form of information (free text) to a computer-native form of information (structured data) is a key component of many pipelines!

Of course, GPT-4 is insanely expensive to use at scale, and still isn't a perfect classifier. So the next step is to take the outputs you get from GPT-4 and use them to fine-tune a smaller model that's really fast and good at your specific problem. In my experience, even *without* using any human annotations or online learning, a model fine-tuned just on GPT-4 outputs can actually *outperform* GPT-4 as a classifier! This seems really counterintuitive at first, but my guess is that the training process acts as a kind of regularization, so the weird mistakes GPT-4 occasionally makes are overwhelmed in the training data by all the times GPT-4 gets it right.

As a disclaimer, we're building open source tooling to ease the transition from prompt to cheaper fine-tuned model at my company, OpenPipe.
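The distillation workflow described here can be sketched end-to-end. `teacher_label` stands in for an expensive GPT-4 call, and the "student" is a deliberately tiny per-class word-count model so the sketch runs offline; the point is the pipeline shape, not the model choice:

```python
import re
from collections import Counter, defaultdict

def teacher_label(text):
    # Stand-in for an expensive GPT-4 classification call.
    return "positive" if "love" in text or "great" in text else "negative"

unlabeled = [
    "great product, love it",
    "terrible, broke in a day",
    "love the battery life",
    "awful support, never again",
]

# 1. Let the big model label cheap, unlabeled data.
training = [(t, teacher_label(t)) for t in unlabeled]

# 2. "Fine-tune" the student: here, just per-class word frequencies.
tokenize = lambda s: re.findall(r"[a-z]+", s.lower())
centroids = defaultdict(Counter)
for text, label in training:
    centroids[label].update(tokenize(text))

def student_predict(text):
    # Fast, local, no API call per prediction.
    return max(centroids,
               key=lambda lbl: sum(centroids[lbl][w] for w in tokenize(text)))
```

In practice step 2 would be a real fine-tune (e.g. a small transformer on the GPT-4-labeled set), but the teacher-labels-then-train-student loop is the same.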
kippinitreal over 1 year ago
Totally agree that this is underappreciated.

Between open-source modeling tools being incredible, transfer learning allowing dirt-cheap fine-tuning, and now mega-models being able to instantly give you a "mostly right" dataset, the cost of creating ML features has dropped to almost nothing.

Products that took quarters or years and required big budgets for labeling, ML specialists, GPUs, etc. just a few years ago can now be done in an hour or so for free (if you are scrappy). I imagine this is going to lead to a ton of great ML features that weren't worth funding in the past but are very valuable in aggregate. Similar to the mid-2000s, when the cost and ease of web development came down enough that there was a lot more experimentation and fun to be had.
lamroger over 1 year ago
Agreed! I think it's going to be hard for software developers to adjust to the "data science/engineering" mindset of monitoring and iterating on the long tail of maintenance. A lot of teams already have this issue with deterministic code running in production. I think there's a big opportunity to help purely-software teams learn and adjust.
jsight over 1 year ago
Honestly, this was one of the first things that excited me about ChatGPT. I'm really eager to see a high-performance inference engine that can keep up with my log data.

Being able to teach an AI assistant to look for specific (but not too specific) things with just a prompt would be incredibly helpful.
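A rough sketch of what that could look like: stream log lines through a classifier and keep only the interesting ones. `classify_line` is a stand-in for "prompt + small fast model"; the keyword list and log lines are made up for illustration:

```python
# Triage logs by classifying each line as alert/noise. In a real setup
# classify_line would batch lines through a fast local model with a prompt
# describing what "interesting" means; here a keyword check stands in.

INTERESTING = ("error", "timeout", "oom")

def classify_line(line):
    return "alert" if any(k in line.lower() for k in INTERESTING) else "noise"

logs = [
    "INFO  request served in 12ms",
    "ERROR connection timeout to db-replica-2",
    "INFO  cache warmed",
]
alerts = [l for l in logs if classify_line(l) == "alert"]
print(alerts)
```

The appeal of the LLM version is that "specific but not too specific" lives in the prompt instead of a brittle keyword list.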
100k over 1 year ago
Another thing you can do with LLMs that I think is pretty interesting is to use them to train a cheaper, faster model, then use the faster model in your application.
9dev over 1 year ago
We're doing this pretty successfully to identify products in massive text content, and even more importantly, we then perform a second pass and let the models categorise the identified products, and *then* do a third pass to build a category hierarchy. This gets us a full product taxonomy with practically no sweat. It's amazing, really.
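The three-pass approach can be sketched as a pipeline. Each `llm_*` function below stands in for a model call with a task-specific prompt; the product names and categories are hypothetical and the stand-ins are deterministic so the flow runs end-to-end:

```python
# Pass 1: identify product mentions (stand-in: a fixed vocabulary).
def llm_extract_products(text):
    known = ["iPhone 15", "Galaxy S24", "ThinkPad X1"]
    return [p for p in known if p in text]

# Pass 2: assign a category to each identified product.
def llm_categorize(product):
    return "laptop" if "ThinkPad" in product else "phone"

# Pass 3: roll the categories up into a hierarchy.
def llm_build_hierarchy(categorized):
    taxonomy = {}
    for product, category in categorized:
        taxonomy.setdefault("electronics", {}) \
                .setdefault(category, []).append(product)
    return taxonomy

text = "Comparing the iPhone 15 and the ThinkPad X1 for travel."
products = llm_extract_products(text)
taxonomy = llm_build_hierarchy([(p, llm_categorize(p)) for p in products])
print(taxonomy)
```

Splitting the work into narrow passes like this tends to be easier to prompt and validate than asking one model call for the whole taxonomy at once.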
superb-owl over 1 year ago
I couldn't agree more. I talked a bit about how amazing and magical LLM-based text classification is here: https://blog.superb-owl.link/p/the-shapes-of-stories-with-chatgpt/
bxguff over 1 year ago
The LLM ouroboros starts with models being used to create training data for models.
particlesy over 1 year ago
You can use Particlesy to create a custom GPT-4 bot trained just to classify CSV row data and integrate systems.

One interesting use case we see is a SaaS company using our REST API to access a Particle with custom instructions just for integration with other systems. They provide a CSV row, and the GPT-4 model classifies and maps the columns onto their key columns. In effect, they are able to integrate with almost any system in their vertical as an out-of-the-box integration. It is more expensive, but it is great for the initial trial phase, and the costs can be passed on to the customer. https://www.particlesy.com
vinay_ys over 1 year ago
I'm waiting for a well-optimized LLM-based system built into a local editor like Obsidian, so I can ask it to scan my entire local Documents folder and supercharge my reading and writing locally.
rckrd over 1 year ago
We've found the same. A lot of usage goes through our LLM categorization endpoint. The toughest problem was actually constraining the model to output only valid categories and not hallucinate new ones, and to return only one for single-classification (or multiple if that's the mode).

[0] https://matt-rickard.com/categorization-and-classification-with-llms
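The simplest defense against hallucinated categories is validating the answer against the allowed set and retrying. A sketch, with `call_model` as a stand-in LLM that deliberately misbehaves on its first attempt to exercise the retry path (the labels and prompt are illustrative):

```python
VALID = {"news", "sports", "tech"}

_attempts = {"n": 0}
def call_model(prompt):
    # Stand-in for an LLM call; hallucinates a near-miss label once.
    _attempts["n"] += 1
    return "technology" if _attempts["n"] == 1 else "tech"

def classify(text, max_retries=3):
    prompt = f"Classify into one of {sorted(VALID)}: {text}"
    for _ in range(max_retries):
        answer = call_model(prompt).strip().lower()
        if answer in VALID:  # accept only exact members of the label set
            return answer
    raise ValueError("model never produced a valid category")

print(classify("New GPU benchmarks released"))
```

Constrained decoding (logit bias, grammar-based sampling) removes the need for the retry loop entirely, but post-hoc validation like this works with any API.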
brap over 1 year ago
With everyone talking about LLMs being glorified autocomplete, I would actually like to see how well they perform as autocomplete, because most built-in ones are pretty bad.
magospietato over 1 year ago
There is a software company in the UK that is performing medical-jargon-to-simple-English translation on text emitted from speech recognition.

They achieve this via a fairly complex set of regular expressions, which must represent a significant time investment to research and maintain.

The same effect can be achieved with a properly prompted GPT-4 completion request, which took about five minutes to write.
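A sketch of the prompted version, where the prompt effectively *is* the implementation. `complete` is any text-completion callable; the exact prompt wording and the toy offline stand-in below are assumptions, not the company's actual system:

```python
def simplify_jargon(note, complete):
    # One prompt replaces the whole regex pipeline.
    prompt = (
        "Rewrite the following clinical note in plain English a patient "
        "can understand. Keep every medical fact intact.\n\n" + note
    )
    return complete(prompt)

# Toy offline stand-in for the model so the sketch runs without an API key.
def fake_complete(prompt):
    note = prompt.rsplit("\n\n", 1)[-1]
    return note.replace("myocardial infarction", "heart attack")

print(simplify_jargon("Patient suffered a myocardial infarction.", fake_complete))
```

The trade-off versus the regex set is maintenance cost against latency, per-call cost, and the need to audit model outputs in a medical setting.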
wodenokoto over 1 year ago
The nice thing about active learning with a classic ML model is that every time you annotate a data point, the model learns.

How do you update your prompt to take the new data point into account? Or do you just add it as an example inside the prompt and let it grow?
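The "let it grow" option from the question above can be sketched directly: each new annotation is appended as a few-shot example, capped so the prompt doesn't grow without bound. There is no weight update; the "learning" lives entirely in the context. The task wording and cap are illustrative:

```python
def build_prompt(task, examples, query, max_examples=8):
    # Keep only the most recent annotations to bound prompt length.
    shots = examples[-max_examples:]
    lines = [task]
    for text, label in shots:
        lines.append(f"Text: {text}\nLabel: {label}")
    lines.append(f"Text: {query}\nLabel:")
    return "\n\n".join(lines)

examples = [("great service", "positive")]
examples.append(("slow shipping", "negative"))  # new annotation arrives
prompt = build_prompt("Classify the sentiment.", examples, "loved it")
print(prompt)
```

Fancier variants retrieve the k nearest annotated examples per query instead of the most recent ones, which behaves more like the active-learning loop the comment is asking about.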
umutisik over 1 year ago
And very likely this is coming to computer vision too, with multimodal GPT.