
GPT-4: 8 x 220B experts trained with different data/task distributions

47 points by MasterScrat almost 2 years ago

4 comments

SheinhardtWigCo almost 2 years ago
As a heavy user of GPT-4 (I'm working on a plugin), reading this felt like a puzzle piece being dropped into place.

Maybe this is just confirmation bias, but yeah, trying to push the model's capabilities is like working with a committee of brilliant minds chaired by an idiot.

Also, I can see why they kept this secret. Competitors just shaved months off their R&D timelines.
euclaise almost 2 years ago
The only paper that I could find using an approach with fully separated experts like this is https://arxiv.org/pdf/2208.03306.pdf
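For context, here is a minimal sketch of the kind of top-k mixture-of-experts routing the title alludes to: a small router scores the experts for each token, the top few are consulted, and their outputs are mixed by the normalized gate weights. All sizes, names, and the routing rule are illustrative assumptions; this is not a description of OpenAI's actual GPT-4 implementation.

# Toy top-2 mixture-of-experts layer (illustrative sketch only; all names and
# sizes are assumptions, not GPT-4's real architecture).
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS = 8    # expert count taken from the post title
D_MODEL = 16     # toy hidden size
TOP_K = 2        # experts consulted per token

# Each "expert" is just an independent linear map in this sketch.
expert_weights = [rng.standard_normal((D_MODEL, D_MODEL)) / np.sqrt(D_MODEL)
                  for _ in range(N_EXPERTS)]
router_weights = rng.standard_normal((D_MODEL, N_EXPERTS)) / np.sqrt(D_MODEL)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def moe_layer(tokens):
    """Route each token to its top-k experts and mix their outputs."""
    probs = softmax(tokens @ router_weights)           # (n_tokens, N_EXPERTS)
    top_k = np.argsort(-probs, axis=-1)[:, :TOP_K]     # chosen expert ids
    out = np.zeros_like(tokens)
    for i, token in enumerate(tokens):
        chosen = top_k[i]
        gate = probs[i, chosen] / probs[i, chosen].sum()  # renormalize gates
        for g, e in zip(gate, chosen):
            out[i] += g * (token @ expert_weights[e])
    return out, top_k

tokens = rng.standard_normal((4, D_MODEL))
mixed, routing = moe_layer(tokens)
print("expert assignments per token:", routing.tolist())

The fully separated experts in the paper linked above differ from this jointly trained setup: there, each expert model is trained independently on its own data slice and the outputs are combined only at inference time, rather than being trained together with a learned router.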
swyx almost 2 years ago
the source podcast that this came from: https://news.ycombinator.com/item?id=36407269
adeon almost 2 years ago
Is this an actually confirmed detail or just something George Hotz speculated? How credible is it?