
RedPajama at 440B tokens, higher quality than Pythia and StableLM

6 points, by jamiedg, about 2 years ago

1 comment

jamiedg, about 2 years ago
A week ago we announced RedPajama, a project to create leading open-source models. We released the first step in the project: a training dataset of over 1.2 trillion tokens following the LLaMA recipe.

Today we shared progress on training our first model on this dataset, a 7B parameter model using the Pythia architecture. So far we are a bit less than 50% through the training, at 440B tokens. We published HELM benchmark results on 16 different scenarios for this checkpoint, showing the model accuracy to be quite high for this stage of training.
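A minimal sketch of how one might inspect the released training corpus with the Hugging Face `datasets` library. The repo id "togethercomputer/RedPajama-Data-1T-Sample" (a small sample of the full 1.2T-token dataset) and its "text"/"meta" fields are assumptions, not something stated in the post.

```python
# Sketch: peek at a few RedPajama documents without downloading the full corpus.
# Assumes the sample dataset is available as "togethercomputer/RedPajama-Data-1T-Sample"
# and exposes "text" and "meta" fields per document.
from datasets import load_dataset

ds = load_dataset(
    "togethercomputer/RedPajama-Data-1T-Sample",
    split="train",
    streaming=True,  # iterate lazily instead of materializing the whole split
)

# Print the source metadata and the first 80 characters of a few documents.
for i, example in enumerate(ds):
    print(example.get("meta"), str(example.get("text", ""))[:80])
    if i >= 2:
        break
```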