TechEcho
Ask HN: What's the best framework for text classification (few-shot learning)?

6 points by backend-dev-33 about 2 years ago
I am looking for software to classify documents into 10-20 categories. The documents are about half a screen to a full screen long.

There is some labeled data (about 50-80 labeled documents per category, not 500 per category), so few-shot learning might be an option.

Algorithms: it might be something like k-nearest neighbors or some ML/neural network approach (transformers? LLMs?). It should just do the classification properly.

Some restrictions: it should be a "ready to use" pipeline with documentation about training the model, parameter optimization, etc. If possible, there should be some way to use the framework/library without Python (I'm not a Python developer). For example, [1] and [2] let you use a command-line interface for everything, so using Python seems to be optional for these frameworks. The SetFit framework (see [3] and [4]) looks quite promising (good results with 8 labeled samples per class!), but it requires doing everything in Python.

[1] https://fasttext.cc/docs/en/supervised-tutorial.html

[2] https://neuml.github.io/txtai/pipeline/text/labels/

[3] https://github.com/huggingface/setfit

[4] https://www.philschmid.de/getting-started-setfit
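For reference, the k-nearest-neighbor idea mentioned in the question can be sketched in a few lines. This is only a rough illustration, not how any of the linked frameworks work: it assumes plain bag-of-words vectors with cosine similarity, uses only the Python standard library, and the function names (`tokenize`, `vectorize`, `cosine`, `knn_classify`) and the toy `TRAIN` data are made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(text):
    """Turn a document into a sparse bag-of-words vector (token -> count)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(document, labeled_docs, k=3):
    """Return the majority label among the k most similar labeled documents."""
    query = vectorize(document)
    scored = sorted(
        ((cosine(query, vectorize(text)), label) for text, label in labeled_docs),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training set; a real run would load the
# 50-80 labeled documents per category described in the question.
TRAIN = [
    ("invoice payment due amount total", "billing"),
    ("refund charged twice on my credit card", "billing"),
    ("payment receipt and invoice number", "billing"),
    ("app crashes on startup with error", "bug"),
    ("error message when saving file crashes", "bug"),
    ("startup freeze and crash after update", "bug"),
]

print(knn_classify("I was charged twice, please refund the payment", TRAIN))
print(knn_classify("the app shows an error and crashes", TRAIN))
```

With only raw word counts, this kind of baseline depends heavily on vocabulary overlap; swapping `vectorize` for sentence embeddings is roughly the direction that approaches like SetFit take, though the details differ.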

1 comment

txtai about 2 years ago
SetFit is a great framework for building a text classifier.

This is a pretty straightforward problem and a good fit for a standard text classifier as well.

Here is an example of fine-tuning a model with txtai: https://colab.research.google.com/github/neuml/txtai/blob/master/examples/16_Train_a_text_labeler.ipynb