

Model training diary/journal for LLMs?

1 point by nalzok, over 1 year ago
About half a year ago, some big tech company released an open-source LLM. What makes that model special is that they made available a model training diary/journal recording everything their engineers did to babysit the training process, e.g. "on day 143, the training loss plateaued, so we decreased the learning rate further". I think it was in a shared Google Doc.

Can you remind me of the name of the company/model?

1 comment

nalzok, over 1 year ago
Never mind, I figured it out: https://github.com/facebookresearch/metaseq/blob/main/projects/OPT/chronicles/OPT175B_Logbook.pdf