Ask HN: Why did LLM capability accelerate a few years ago?

1 point by turblety 8 months ago

When GPT was released, it was a huge milestone that kicked off massive growth in AI LLMs. But why? What made this possible? We've done neural networks for years, but why have they suddenly become so good? Is it hardware? Technique? What was the defining moment?

3 comments

jfengel 8 months ago

It's this: https://en.wikipedia.org/wiki/Attention_Is_All_You_Need

They adapted a technique developed for translation, which had already been advancing a lot over the past decade or so.

"Attention" requires really big matrices, and they threw truly vast amounts of data at it. People had been developing techniques for managing that sheer amount of computation, including dedicated hardware and GPUs.

It's still remarkable that it got *so* good. It's as if there is some emergent phenomenon that appeared only when enough data was approached the right way. So it's not at all clear whether significant improvements will require another significant discovery, or if it's just a matter of evolution from here.
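[Editor's note: to make the "big matrices" point concrete, here is a minimal NumPy sketch of the scaled dot-product attention operation described in that paper. The shapes and names are illustrative assumptions; real transformers add multi-head projections, masking, and far larger dimensions.]

    # Minimal sketch of scaled dot-product attention ("Attention Is All
    # You Need", Vaswani et al., 2017). Illustrative only, not a
    # production implementation.
    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        """Q, K: (seq_len, d_k); V: (seq_len, d_v)."""
        d_k = K.shape[-1]
        # Every token attends to every other token, so the score matrix
        # is (seq_len x seq_len) -- this is where the big matrices and
        # the appetite for compute come from.
        scores = Q @ K.T / np.sqrt(d_k)
        # Softmax over each row turns scores into attention weights.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    rng = np.random.default_rng(0)
    seq_len, d_model = 8, 16   # toy sizes; real models use thousands
    Q = rng.standard_normal((seq_len, d_model))
    K = rng.standard_normal((seq_len, d_model))
    V = rng.standard_normal((seq_len, d_model))
    out = scaled_dot_product_attention(Q, K, V)
    print(out.shape)  # (8, 16)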
anshumankmr 8 months ago

https://www.youtube.com/watch?v=eMlx5fFNoYc&ab_channel=3Blue1Brown
verdverm 8 months ago

Transformers for text, which had been used for images prior. (gross simplification)

This is also what limits them in other ways.