科技回声


Pre-Training GPT-4.5 [video]

4 points by waynenilsen about 1 month ago

1 comment

waynenilsen about 1 month ago
The only reason that I'm sharing this is because there is a gem at the end. From the transcript:

44:26 ...its responses, but it's incredible. It is incredible. Related to that, and sort of last question: in some sense this whole effort, which was hugely expensive in terms of people and time and dollars and everything else, was an experiment to further validate that the scaling laws keep going, and why. And it turns out they do, and they probably keep going for a long time. I accept scaling laws like I accept quantum mechanics or something, but I still don't know why. Why should that be a property of the universe? So why are scaling laws a property of the universe?

45:01 You want... I can take a stab. Well, the fact that more compression will lead to more intelligence, that has this very strong philosophical grounding. So the question is why does training bigger models for longer give you more compression, and there are a lot of theories here. The one I like is that the relevant concepts are sort of sparse in the data of the world, and in particular it's a power law, so that the hundredth most important concept appears in one out of a hundred of the documents or whatever. So there's long tails.

45:39 Does that mean that if we make a perfect data set and figure out very data efficient algorithms, I mean, can go home?

45:44 It means that there's potentially exponential compute wins on the table to be very sophisticated about your choice of data. But basically when you just scoop up data passively, you're going to require 10xing your compute and your data to get the next constant number of things in that tail, and that tail just keeps going. It's long, you can keep mining it, although as you alluded to, you can probably do a lot better.

46:22 I think that's a good place to leave it. Thank you guys very much, that was fun. Yeah, thank you.
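
The long-tail argument quoted above can be made concrete with a toy calculation. This is only a sketch under assumptions that are not in the video: the concept at rank k is taken to appear in exactly 1/k of documents (matching the "hundredth concept in one out of a hundred documents" remark), and a concept is treated as "covered" once it is expected to appear `min_examples` times, a hypothetical threshold chosen purely for illustration.

```python
def documents_needed(rank: int, min_examples: int = 10) -> int:
    """Corpus size at which the concept at `rank` is expected to appear
    `min_examples` times, ASSUMING it occurs in 1/rank of all documents."""
    return rank * min_examples


def deepest_rank_covered(num_documents: int, min_examples: int = 10) -> int:
    """Deepest rank whose concept is expected to appear at least
    `min_examples` times in a passively collected corpus (same assumption)."""
    return num_documents // min_examples


# Each 10x increase in corpus size reaches one decade further into the tail.
for n in (10**3, 10**4, 10**5, 10**6):
    print(f"{n:>9} documents -> deepest covered rank {deepest_rank_covered(n)}")
```

In this toy model, every 10x increase in passively collected documents pushes the deepest covered rank only one order of magnitude further down the tail, and the tail never ends, which is one way to read the "10xing your compute and your data" remark. It says nothing about how much a curated data set could improve on this.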