
Ask HN: How are LLMs getting smarter beyond just scaling?

2 points | by chenxi9649 | 6 months ago
I have a question for those who deeply understand LLMs.

From what I understand, the leap from GPT-2 to GPT-3 was mostly about scaling - more compute, more data. GPT-3 to 4 probably followed the same path.

But in the year and a half since GPT-4, LLMs have gotten significantly better, especially the smaller ones. I'm consistently impressed by models like Claude 3.5 Sonnet, despite us supposedly reaching scaling limits.

What's driving these improvements? Is it thousands of small optimizations in data cleaning, training, and prompting? Or am I just deep enough in tech now that I'm noticing subtle changes more? Really curious to hear from people who understand the technical internals here.

No comments yet
