Fine-Tuning Transformers for NLP

67 points by dylanbfox almost 4 years ago

3 comments

uniqueuid almost 4 years ago
For anyone looking to fine-tune transformers with less work, there is the FARM project (https://github.com/deepset-ai/FARM), which has some more or less ready-to-go configurations (classification, question answering, NER, and a couple of others). It's really almost "plug in a CSV and run".

By the way, a pet peeve is sentiment detection. It's a useful method, but please be aware that it does not measure "sentiment" in the way one would normally think, and that what it measures varies strongly across methods (https://www.tandfonline.com/doi/abs/10.1080/19312458.2020.1869198).
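To make the "plug in a CSV and run" idea concrete, here is a minimal sketch of that workflow. It is written against the Hugging Face transformers/datasets APIs rather than FARM's own interface (which differs); the file name, column names, checkpoint, and hyperparameters are illustrative assumptions.

```python
# Sketch of a "CSV in, classifier out" fine-tuning loop using Hugging Face
# transformers + datasets. train.csv is assumed to have "text" and "label"
# columns; the checkpoint and hyperparameters are illustrative.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("csv", data_files="train.csv")["train"]
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Turn the raw text column into token IDs and attention masks.
    return tokenizer(batch["text"], truncation=True, padding="max_length")

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3),
    train_dataset=dataset,
)
trainer.train()
```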
whimsicalism almost 4 years ago
Hm. I read this expecting a more in-depth discussion of best practices for fine-tuning massive transformers while avoiding catastrophic forgetting, e.g.:

* How should you select the learning rate?

* What tasks are best for fine-tuning on small amounts of data? etc.

Instead, this seems mostly to just run through the implementation of ML/DL 101: the loss function for binary classification, helper functions to load data, etc.
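On the learning-rate question this comment raises, one widely used mitigation for catastrophic forgetting is discriminative learning rates: a small rate for the pretrained encoder so it drifts slowly, and a larger one for the freshly initialized head. A minimal PyTorch sketch, with illustrative values:

```python
# Two parameter groups: the pretrained BERT body fine-tunes at a small
# learning rate, while the randomly initialized classification head learns
# faster. The checkpoint and learning rates are illustrative assumptions.
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

optimizer = torch.optim.AdamW([
    {"params": model.bert.parameters(), "lr": 2e-5},        # pretrained body
    {"params": model.classifier.parameters(), "lr": 1e-3},  # new head
])
```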
visarga almost 4 years ago
The same transformer diagram from the original paper, replicated everywhere. Nobody's got time for redrawing.

BTW, take a look at the "sentence-transformers" library, a nice interface on top of Hugging Face for this kind of operation (reusing, fine-tuning): https://www.sbert.net/
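The sentence-transformers library recommended here reduces "reuse a pretrained model for embeddings" to a couple of lines. A minimal usage sketch; the checkpoint name is an assumption (any SBERT model listed at https://www.sbert.net/ works):

```python
# Encode two sentences with a pretrained SBERT model and compare them.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative checkpoint
embeddings = model.encode(["Fine-tuning is cheap.", "Pretraining is expensive."])
print(util.cos_sim(embeddings[0], embeddings[1]))  # cosine similarity
```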