
Ask HN: Why did LLM capability accelerate a few years ago?

1 point by turblety, 8 months ago
When GPT was released, it was a huge milestone that kicked off massive growth in LLMs. But why? What made this possible? We've been building neural networks for years, so why have they suddenly become so good? Is it hardware? Technique? What was the defining moment?

3 comments

jfengel, 8 months ago
It's this: https://en.wikipedia.org/wiki/Attention_Is_All_You_Need

They adapted a technique developed for translation, which had already been advancing a lot over the past decade or so.

"Attention" requires really big matrices, and they threw truly vast amounts of data at it. People had been developing techniques for managing that sheer amount of computation, including dedicated hardware and GPUs.

It's still remarkable that it got *so* good. It's as if there is some emergent phenomenon that appeared only when enough data was approached the right way. So it's not at all clear whether significant improvements will require another significant discovery, or if it's just a matter of evolution from here.
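For readers who want to see why attention means "really big matrices", here is a minimal NumPy sketch of the scaled dot-product attention described in that paper; the function name and toy dimensions are illustrative, not from the thread:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention, as in "Attention Is All You Need".

    Q, K, V: (seq_len, d_k) matrices of queries, keys, and values.
    Every token is compared against every other token, so the score
    matrix grows quadratically with sequence length.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (seq_len, seq_len) similarity matrix
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted sum of values

# Toy example: 4 tokens with 8-dimensional embeddings
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)    # (4, 8)
```

The (seq_len, seq_len) score matrix is the part that balloons with long contexts and large batches, which is where the GPU and dedicated-hardware work the comment mentions comes in.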
anshumankmr, 8 months ago
https://www.youtube.com/watch?v=eMlx5fFNoYc&ab_channel=3Blue1Brown
verdverm, 8 months ago
Transformers for text, which had been used for images prior. (gross simplification)

This is also what limits them in other ways.