Model training diary/journal for LLMs?

1 point by nalzok over 1 year ago

About half a year ago, some big tech company released an open-source LLM. What made that model special is that they also published a training diary/journal recording everything their engineers did to babysit the training process, e.g. "on day 143, the training loss plateaued, so we decreased the learning rate further". I think it was in a shared Google Doc.

Can you remind me of the name of the company/model?

1 comment

nalzok over 1 year ago

Never mind, I figured it out: https://github.com/facebookresearch/metaseq/blob/main/projects/OPT/chronicles/OPT175B_Logbook.pdf