TechEcho — a tech news platform built with Next.js, providing global tech news and discussions.

Fine-Tuning Transformers for NLP

67 points by dylanbfox almost 4 years ago

3 comments

uniqueuid · almost 4 years ago
For anyone looking to fine-tune transformers with less work, there is the FARM project (https://github.com/deepset-ai/FARM), which has some more or less ready-to-go configurations (classification, question answering, NER, and a couple of others). It's really almost "plug in a CSV and run".

By the way, a pet peeve is sentiment detection. It's a useful method, but please be aware that it does not measure "sentiment" in the way one would normally think, and that what it measures varies strongly across methods (https://www.tandfonline.com/doi/abs/10.1080/19312458.2020.1869198).
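[Editor's note] A minimal pure-Python sketch, not from the thread and not FARM's actual API, of the "plug in a CSV and run" input shape such tools typically expect: one text column and one label column (the column names here are assumptions).

```python
import csv
import io

def load_text_classification_csv(f):
    """Read (text, label) pairs from a CSV with 'text' and 'label' columns.

    The column names are illustrative; real tools let you configure them.
    """
    reader = csv.DictReader(f)
    return [(row["text"], row["label"]) for row in reader]

# In-memory stand-in for a file on disk.
sample = io.StringIO("text,label\ngreat product,pos\nbroke in a day,neg\n")
data = load_text_classification_csv(sample)
```

With the data in this shape, a ready-made configuration only needs to know the task type (here, binary classification) to start training.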
whimsicalism · almost 4 years ago
Hm. I read this expecting a more in-depth discussion of best practices for fine-tuning massive transformers while avoiding catastrophic forgetting, i.e.:

* How should you select the learning rate?
* What tasks are best for fine-tuning on small amounts of data? etc.

Instead, this mostly just runs through an ML/DL 101 implementation: a loss function for binary classification, helper functions to load data, etc.
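[Editor's note] To make the two halves of this comment concrete, here is a pure-Python sketch, not from the article, of the "101" part (per-example binary cross-entropy) alongside one common answer to the learning-rate question: discriminative fine-tuning, where lower layers get geometrically smaller learning rates so pretrained features are disturbed less (the decay factor 0.95 is an illustrative assumption).

```python
import math

def binary_cross_entropy(p, y, eps=1e-12):
    """Loss for one example: y in {0, 1}, p = predicted probability of class 1."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def layerwise_lrs(base_lr, n_layers, decay=0.95):
    """Discriminative fine-tuning: layer i gets base_lr * decay^(distance from top).

    The top layer trains at base_lr; each layer below it is scaled down by `decay`.
    """
    return [base_lr * decay ** (n_layers - 1 - i) for i in range(n_layers)]
```

A maximally uncertain prediction (p = 0.5) costs ln 2 ≈ 0.693 regardless of the true label, and the returned learning-rate list keeps the top layer at the full base rate.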
visarga · almost 4 years ago
The same transformer diagram from the original paper, replicated everywhere. Nobody has time for redrawing.

BTW, take a look at the "sentence-transformers" library, a nice interface on top of Hugging Face for this kind of operation (reusing, fine-tuning): https://www.sbert.net/