50% on HumanEval with just 1.3B model

76 points by sytelus, almost 2 years ago

5 comments

bigyikes, almost 2 years ago
For reference, this beat GPT-3.5 which scores 47%, but not GPT-4 which scored a massive 67%.

Beating out GPT-3.5 at *any* task with such a small model is very cool to me.

How much longer until these dumb virtual assistants (Siri, Google, Alexa) get replaced with on-device LLMs? We’ve gotta be getting close. These small, optimized models are catching up quickly in so many domains.
asicsp, almost 2 years ago
Dupe: "Textbooks are all you need" https://news.ycombinator.com/item?id=36413768
RecycledEle, almost 2 years ago
I wonder if learning to train generative AIs will teach us anything about teaching humans, other than the use of AI tutors. Can we determine the usefulness of a text by how well it trains an AI?
p0w3n3d, almost 2 years ago
What is B in this context? Billion?
throwaway4good, almost 2 years ago
What is HumanEval?