New Gemini model significantly outperforms others on Chatbot Arena (LMSYS)

110 points by zopper 6 months ago

6 comments

impulser_ 6 months ago
Based on my testing, this model is significantly better than other Gemini models, especially on programming/math-related tasks. The current Gemini models are pretty useless for anything related to programming/math, but this experimental model puts Gemini ahead of GPT-4o and pretty close to Claude 3.5.

The major problem with Claude 3.5 is that you can't have a conversation involving a large amount of text, because you will constantly hit rate limits, and it's very annoying.

This model, with a 2 million token context window, is probably the best model right now for programming.
chenxi9649 6 months ago
I feel like it's at the point where I'm not too sure how these rankings impact my choice of LLM. Every time a new model tops the charts, I try it for a bit and then go back to claude-3.5-sonnet, both for coding and for day-to-day questions.

I don't know if I'm just getting used to the Claude style of response, or the orangey UI that I find kind of cozy, but I think we need better ways to convey the difference between models.
Alifatisk 5 months ago
Claude has been my go-to, mainly because of the huge context window. But today, that doesn't seem to be the case, or you hit the rate limit pretty quickly and have to wait a whole day.

Google AI Studio with its 2M context window + this experimental version could be a good replacement.
leobg 6 months ago
Google has one moat that is often overlooked: Googlebot. It gets to scrape content that Cloudflare and paywalls make invisible to pretty much every other crawler.
jug 5 months ago
I feel like these are test versions of Gemini Pro 2.0. The changes are too foundational to be mere iterations/break date updates for 1.5 Pro.
ralfd 6 months ago
What is the new Gemini model? 1.5-pro-002?