New Gemini model significantly outperforms others on Chatbot Arena (LMSYS)

110 points by zopper, 6 months ago

6 comments

impulser_, 6 months ago
Based on my testing, this model is significantly better than other Gemini models, especially with programming/math-related tasks. The current Gemini models are pretty useless for anything related to programming/math, but this experimental model puts Gemini ahead of GPT-4o, and pretty close to Claude 3.5.

The major problem with Claude 3.5 is that you can't have a conversation with a large amount of text because you will constantly hit rate limits, and it's very annoying.

This model, with a 2 million token context window, is probably the best model right now for programming.
chenxi9649, 6 months ago
I feel like it's at the point where I'm not too sure how these rankings impact my choice of LLM. Every time a new model tops the charts, I'll try it for a bit and go back to claude-3.5-sonnet, both for coding and day-to-day questions.

I don't know if I'm just getting used to the Claude style of response, or the orangey UI that I kind of find cozy, but I think we need better ways to convey the difference between models.
Alifatisk, 6 months ago
Claude has been my go-to, mainly because of the huge context window. But today, that doesn't seem to be the case, or you hit the rate limit pretty quickly and have to wait a whole day.

Google Studio with its 2M context window + this experimental version could be a good replacement.
leobg, 6 months ago
Google has one moat that is often overlooked: Googlebot. They get to scrape content that is invisible to pretty much every other crawler, thanks to Cloudflare and paywalls.
jug, 6 months ago
I feel like these are test versions of Gemini Pro 2.0. The changes are too foundational to be mere iterations/break date updates for 1.5 Pro.
ralfd, 6 months ago
What is the new Gemini model? 1.5-pro-002?