My frustration with using these models for programming in the past has largely been around their tendency to hallucinate APIs that simply don't exist. The Gemini 2.5 models, both Pro and Flash, seem significantly less susceptible to this than any other model I've tried.<p>There are still significant limitations: no amount of prompting will get current models to approach abstraction and architecture the way a person does. But I'm finding that these Gemini models are finally able to replace search and Stack Overflow for a lot of my day-to-day programming.
> Gemini 2.5 Pro now ranks #1 on the WebDev Arena leaderboard<p>It'd make sense to rename WebDev Arena to React/Tailwind Arena. Its system prompt requires [1] those technologies and the entire tool breaks when requesting vanilla JS or other frameworks. The second-order implications of models competing on this narrow definition of webdev are rather troublesome.<p>[1] <a href="https://blog.lmarena.ai/blog/2025/webdev-arena/#:~:text=PROMPTING%20STRATEGY%20AND%20SYSTEM%20DESIGN" rel="nofollow">https://blog.lmarena.ai/blog/2025/webdev-arena/#:~:text=PROM...</a>
I don't know if I'm doing something wrong, but every time I ask Gemini 2.5 for code it outputs SO MANY comments. An exaggerated amount of comments. Section comments, step comments, block comments, inline comments, all the gang.
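Something like this, to give a flavor (an illustrative made-up snippet, not actual model output):<p><pre><code>  # --- Section 1: Helper utilities ---

  # Step 1.1: Define a function that adds two numbers.
  def add(a, b):  # 'a' and 'b' are the operands
      # Perform the addition and store the result.
      result = a + b  # compute the sum
      # Return the computed result to the caller.
      return result  # done
</code></pre>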
My guess is that they've done a lot of tuning to improve diff-based code editing. Gemini 2.5 is fantastic at agentic work, but it is still pretty rough around the edges in terms of generating perfectly matching diffs to edit code. It's probably one of the very few issues with the model. Luckily, aider tracks this.<p>They measure the old Gemini 2.5 generating proper diffs 92% of the time. I bet this goes up to ~95-98%: <a href="https://aider.chat/docs/leaderboards/" rel="nofollow">https://aider.chat/docs/leaderboards/</a><p>Question for the Google peeps who monitor these threads: Is gemini-2.5-pro-exp (free tier) updated as well, or will it go away?<p>Also, in the blog post, it says:<p><pre><code> > The previous iteration (03-25) now points to the most recent version (05-06), so no action is required to use the improved model, and it continues to be available at the same price.
</code></pre>
Does this mean gemini-2.5-pro-preview-03-25 now uses 05-06? Does the same apply to gemini-2.5-pro-exp-03-25?<p>Update: I just tried updating the date in the exp model (gemini-2.5-pro-exp-05-06) and that doesn't work.
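For anyone else poking at model IDs: rather than guessing date suffixes, you can ask the API which models your key can actually see. A minimal sketch, assuming the google-generativeai Python SDK and a GEMINI_API_KEY environment variable:<p><pre><code>  import os
  import google.generativeai as genai

  genai.configure(api_key=os.environ["GEMINI_API_KEY"])

  # Print every model ID that supports generateContent.
  for m in genai.list_models():
      if "generateContent" in m.supported_generation_methods:
          print(m.name)
</code></pre>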
Interestingly, when comparing benchmarks of Experimental 03-25 [1] and Experimental 05-06 [2], it seems the new version scores slightly lower on everything except LiveCodeBench.<p>[1] <a href="https://storage.googleapis.com/model-cards/documents/gemini-2.5-pro-preview.pdf" rel="nofollow">https://storage.googleapis.com/model-cards/documents/gemini-...</a>
[2] <a href="https://deepmind.google/technologies/gemini/" rel="nofollow">https://deepmind.google/technologies/gemini/</a>
> We’ve seen developers doing amazing things with Gemini 2.5 Pro, so we decided to release an updated version a couple of weeks early to get into developers hands sooner. Today we’re excited to release Gemini 2.5 Pro Preview (I/O edition).<p>What's up with AI companies and their model naming? So is this an updated 2.5 Pro and they indicate it by appending "Preview" to the name? Or was it always called 2.5 Preview and this is an updated "Preview"? Why isn't it 2.6 Pro or 2.5.1 Pro?
I agree it's very good, but the UI is still usually an unusable, scroll-jacking disaster. I've found it's best to let a chat sit for a few minutes after it has finished printing the AI's output. Finding the `ms-code-block` element in dev tools and logging `$0.textContent` is reliable too.
Be careful, this model is worse than 03-25 in 10 of the 12 benchmarks (!)<p>I bet they kept training on coding, made everything else worse along the way, and tried to sweep it under the rug because of the sunk costs.
Is it possible to use this with Cursor? If so, what is the name of the model? gemini-2.5-pro-preview?<p>edit> It's gemini-2.5-pro-preview-05-06<p>edit> Cursor says it doesn't have "good support" yet, but I'm not sure if this is a default message when it doesn't recognise a model. Is this a big deal? Should I wait until it's officially supported by Cursor?<p>Just trying to save time here for everyone - anyone know the answer?
I use Gemini inside cursor, but the web app is basically unusable to me. Of the big three, only Claude seems to have a sensible web app with good markdown formatting, converting big pastes into attachments, and breaking out code into side panels. These seem like relatively obvious features so it’s confusing to me that Google is so behind on the UI here.
I like it. I threw some random concepts (Neon, LSD, Falling, Elite, Shooter, Escher + Mobile Game + SPA) at it and this is what it came up with after a few (5x) roundtrips.<p><a href="https://show.franzai.com/a/star-zero-huge?nobuttons" rel="nofollow">https://show.franzai.com/a/star-zero-huge?nobuttons</a>
Here's a summary of the 394 comments on this post created using the new gemini-2.5-pro-preview-05-06. It looks very good to me - well grouped, nicely formatted.<p><a href="https://gist.github.com/simonw/7ef3d77c8aeeaf1bfe9cc6fd68760b96" rel="nofollow">https://gist.github.com/simonw/7ef3d77c8aeeaf1bfe9cc6fd68760...</a><p>30,408 input, 8,535 output = 12.336 cents.<p>8,500 is a very long output! Finally a model that obeys my instructions to "go long" when summarizing Hacker News threads. Here's the script I used: <a href="https://gist.github.com/simonw/7ef3d77c8aeeaf1bfe9cc6fd68760b96?permalink_comment_id=5568631#gistcomment-5568631" rel="nofollow">https://gist.github.com/simonw/7ef3d77c8aeeaf1bfe9cc6fd68760...</a>
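The arithmetic checks out if you assume the preview pricing of $1.25 per million input tokens and $10 per million output tokens (the under-200k-token prompt tier):<p><pre><code>  input_tokens, output_tokens = 30_408, 8_535
  cost = input_tokens * 1.25 / 1e6 + output_tokens * 10 / 1e6
  print(f"${cost:.5f}")  # $0.12336, i.e. 12.336 cents
</code></pre>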
Usually I don't believe the benchmarks, but first in WebDev Arena specifically is crazy. That one has been Claude for so long, which tracks with my experience.
The "video to learning app" feature is a cool concept (see it in AI Studio). I just passed in two separate Stanford lectures to see if it could come up with an interesting interactive app. The apps it generated weren't too useful, but I can see with more focus and development, it'd be a game changer for education.
So, are people using these tools without the org they work for knowing? The amount of hoops I would have to jump through to get either of the smaller companies I have worked for since the AI boom to let me use a tool like this would make it absolutely not worth the effort.<p>I'm assuming large companies are mandating it, but ultimately the work that these LLMs seem poised for would benefit smaller companies most and I don't think they can really afford using them? Are people here paying for a personal subscription and then linking it to their work machines?
I continue to find Gemini 2.5 Pro to be the most capable model. I leave Cursor on "Auto" model selection but all of my directed interactions are with Gemini. My process right now is to ask Gemini for high-level architecture discussions and broad-stroke implementation task breakdowns, then I use Cursor to validate and execute on those plans, then Gemini to review the generated code.<p>That process works pretty well but not perfectly. I have two examples where Gemini suggested improvements during the review stage that were actually breaking.<p>As an aside, I was investigating the OpenAI APIs and decided to use ChatGPT since I assumed it would have the most up-to-date information on its own APIs. It felt like a huge step back (it was the free model so I cut it some slack). It not only got its own APIs completely wrong [1], but when I pasted the URL for the correct API doc into the chat it still insisted that what was written on the page was the wrong API and pointed me back to the page I had just linked to justify its incorrectness. It was only after I prompted that the new API was possibly outside of its training data that it actually got to the correct analysis. I also find the excessive use of emojis to be juvenile, distracting and unhelpful.<p>1. <a href="https://chatgpt.com/share/681ba964-0240-800c-8fb8-c23a2cae09bf" rel="nofollow">https://chatgpt.com/share/681ba964-0240-800c-8fb8-c23a2cae09...</a>
Google's models are pretty good, but their API(s) and guarantees aren't. We were just told today that 'quota doesn't guarantee capacity', so basically on-demand isn't prod-capable. Add to that that there isn't a second vendor source like Anthropic and OpenAI have, and Google's reliability makes it a hard sell to use them unless you can back up the calls with a different model family altogether.
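Something like this is the shape of it (a rough sketch; the two call helpers are hypothetical stand-ins for whatever SDKs you actually use):<p><pre><code>  def call_gemini(prompt: str) -> str:
      raise NotImplementedError  # e.g. a google-generativeai generate_content() call

  def call_backup(prompt: str) -> str:
      raise NotImplementedError  # e.g. an Anthropic or OpenAI client call

  def generate(prompt: str) -> str:
      try:
          return call_gemini(prompt)
      except Exception as exc:  # in practice, catch the SDK's quota/capacity errors specifically
          print(f"Gemini call failed ({exc}); using backup model")
          return call_backup(prompt)
</code></pre>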
>Best-in-class frontend web development<p>It really is wild to have seen this happen over the last year. The days of traditional "design-to-code" FE work are completely over. I haven't written a line of HTML/CSS in months. If you are still doing this stuff by hand, you need to adapt fast. In conjunction with an agentic coding IDE and a few MCP tools, weeks worth of UI work are now done in hours to a <i>higher</i> level of quality and consistency with practically zero effort.
I don't understand what I'm doing wrong.. it seems like everyone is saying Gemini is better, but I've compared dozens of examples from my work, and Grok has always produced better results.
I find the naming confusing. Haven't I already been using Gemini 2.5 Pro Preview for the past month? Or was that Experimental?<p>Also, how do I understand the OpenAI model names?
I don't use OpenAI anymore since Ilya left but when looking at the benchmarks I'm constantly confused by their model names. We have semantic versioning - why do I need an AI or web search to understand your model name?
Gemini 2.5 Pro is great, but also VERY expensive, with opaque cost insights.<p>Just recently a lot of people (me included) got hit with a surprise bill, with some racking up $500 in costs for normal use.<p>I certainly got burnt and removed my API key from my tools so I don't accidentally use it again.<p>Example: <a href="https://x.com/pashmerepat/status/1918084120514900395?s=46" rel="nofollow">https://x.com/pashmerepat/status/1918084120514900395?s=46</a>
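One way to avoid the surprise is to estimate the bill before sending a big prompt. A sketch assuming the google-generativeai SDK's count_tokens() and the $1.25-per-million input rate (your tier and file names will differ):<p><pre><code>  import os
  import google.generativeai as genai

  genai.configure(api_key=os.environ["GEMINI_API_KEY"])
  model = genai.GenerativeModel("gemini-2.5-pro-preview-05-06")

  prompt = open("huge_context.txt").read()  # hypothetical input file
  n = model.count_tokens(prompt).total_tokens
  print(f"{n} input tokens ~= ${n * 1.25 / 1e6:.2f} before any output is billed")
</code></pre>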
[Tangent] Anyone here using 2.5 Pro in Gemini Advanced? I have been experiencing a ton of bugs, e.g.:<p>- [codes] showing up instead of references,<p>- raw search tool output sliding across the screen,<p>- Gemini continuously answering questions asked two or more messages before but ignoring the most recent one (you need to ask Gemini an unrelated question for it to snap out of this bug for a few minutes),<p>- weird messages including text irrelevant to any of my chats with Gemini, like baseball,<p>- confusing its own replies with mine,<p>- not being able to run its own Python code due to some unsolvable formatting issue,<p>- timeouts, and more.
I've been switching between this and GPT-4o at work, and Gemini is really verbose. But I've been primarily using it. I'm confused though: the model available in Copilot says Gemini 2.5 Pro (Preview), and I've had it for a few weeks. This was just released today. Is this an updated preview? If so, the blog/naming is confusing.
Gemini does not accept upload of TSX files, it says "File type unsupported"<p>You must <i>rename your files to .tsx.txt THEN IT ACCEPTS THEM</i> and works perfectly fine writing TSX code.<p>This is absolutely bananas. How can such a powerful coding engine have this behavior?
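If you have a whole folder to upload, a tiny script saves the manual renaming. A sketch (the src/ path is just an example) that copies each .tsx file to a .tsx.txt sibling and leaves the original alone:<p><pre><code>  import shutil
  from pathlib import Path

  for src in Path("src").rglob("*.tsx"):
      # foo.tsx -> foo.tsx.txt, which the uploader accepts
      shutil.copy2(src, src.with_name(src.name + ".txt"))
</code></pre>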
I'm not sure if this is just me, but with the "Starter Apps" I don't see how you can extend them using AI in aistudio. For example, there doesn't seem to be a way to add more code to the app with AI, even if you copy the Starter App. Am I missing something, or is this just a big miss from Google?
Meanwhile Gemini 2.5 Pro support for VSCode Copilot is still broken :/<p><a href="https://github.com/microsoft/vscode-copilot-release/issues/8404">https://github.com/microsoft/vscode-copilot-release/issues/8...</a>
My biggest frustration right now is just how verbose the output is. Like a freshman aiming to hit a word count without substance, the model just spits out GenAI fluff.<p>Good thinking otherwise.
Their nomenclature is a bit confused. The Gemini web app has a 2.5 Pro (experimental), yet this apparently is referring to 2.5 Pro Preview 05-06.<p>Would be ideal if they incremented the version number or the like.
> We have also updated the model card with the new version of 2.5 Pro<p>No you haven't? At least not at 6am UTC on May 7. The PDF still mentions (03-25) as date of the model.<p>What version do I get on gemini.google.com when I select "2.5 Pro (experimental)"? Has anything changed there or not (yet)?
I have my issues with the code Gemini Pro in AI Studio generates without customized "System Instructions".<p>It turns a well-readable code snippet of 5 lines into a 30-line snippet full of comments and mostly unnecessary error handling - code which becomes harder to reason about.<p>But for sysadmin tasks, like dealing with ZFS and LVM, it is absolutely incredible.
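The kind of customized System Instructions I mean, as an API sketch (assuming the google-generativeai SDK's system_instruction parameter; the wording of the instruction is just an example):<p><pre><code>  import os
  import google.generativeai as genai

  genai.configure(api_key=os.environ["GEMINI_API_KEY"])
  model = genai.GenerativeModel(
      "gemini-2.5-pro-preview-05-06",
      system_instruction="Write concise code. No explanatory comments and no "
                         "defensive error handling unless explicitly requested.",
  )
  print(model.generate_content("Write a function that parses an ISO 8601 date.").text)
</code></pre>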
I honestly had to stop and think "wait a minute, wasn't 2.5 Pro out a few months ago? how come it is in preview now?"<p>Google releasing a new model (as it has a blog post, an announcement, and can be chosen in the API) called 2.5 Pro Preview, while having a 2.5 Pro already out for months, is ridiculous. I thought it was just OpenAI that is unable to use its dozens of billions of dollars to come up with a normal naming scheme - yet here we are with another trillion-dollar company being unable to settle on a versioning scheme that is not confusing.
man that endless commenting seriously kills my flow - gotta say, even after all the prompts and hacks, still can't get these models to chill out. you think we'll ever get ai to stop overdoing it and actually fit real developer habits or is it always gonna be like this?
Hasn't Gemini 2.5 Pro been out for a while?<p>At first I was very impressed with its coding abilities, switching off of Claude for it, but recently I've been using GPT o3, which I find is much more concise and generally better at problem solving when you hit an error.
How does it perform on anything but Python and JavaScript? In my experience my mileage varied a lot when using C#, for example, or Zig, so I've learnt to just let it select the language it wants.<p>Also, why doesn't Ctrl+C work??
I wonder how the latest version of Grok 3 would stack up to Gemini 2.5 Pro on the web dev arena leaderboard. They are still just showing the original early access model for some reason, despite there being API access to the latest model. I've been using Grok 3 with Aider Chat and have been very impressed with it. I get $150 of free API credits every month by allowing them to train on my data, which I'm fine with since I'm just working on personal side projects. Gemini 2.5 Pro and Claude 3.7 might be a little better than Grok 3, but I can't justify the cost when Grok doesn't cost me a penny to use.
Google/Alphabet is a giant hulking machine that’s been frankly running at idle. All that resume-driven development, performance-review promo cycles, and retention of top talent mainly to work on ad tech means it’s packed to the rafters with latent capability. Holding on to so much talent in the face of basically having nothing to do is a testament to the company’s leadership - even if said leadership didn’t manage to make Google push humanity forward over the last decade or so.<p>Now that there’s a big nugget to chew on (LLMs), you’re seeing that latent capability come to life. This awakening feels more bottom-up driven than top-down. Google’s a war machine chugging along nicely in peacetime, but now it’s war again!<p>Hats off to the engineers working on the tech. Excited to try it out!
I truly do not understand how people are getting worthwhile results from Gemini 2.5 Pro. I have used all of the major models for lots of different programming tasks and I have never once had Gemini produce something useful. It's not just wrong, it's laughably bad. And people are making claims that it's the best. I just... don't... get it.
I keep hearing good things about Gemini online and offline. I wrote them off as terrible when they first launched and have not looked back since.<p>How are they now? Sufficiently good? Competent? Competitive? Or limited? My needs are very consumer oriented, not programming/api stuff.
As a non-programmer, I have been really loving Gemini 2.5 Pro for Python scripting: manipulating text and Excel files, and web scraping. In the past I was able to use ChatGPT to code some of the things that I wanted, but with Gemini 2.5 Pro it has been just another level. If they improve it further, that would be amazing.
Is it just me who finds that, while Gemini 2.5 is able to generate a lot of code, the end results are usually lackluster compared to Claude and even ChatGPT? I also find it hard-headed: it frequently does things in ways I explicitly told it not to. The massive context window is pretty great though and enables me to do things I can't with the others, so it still gets used a lot.
The google sheets UI asked me to try Gemini to create a formula, so I tried it, starting with "Create a formula...", and its answer was "Sorry, I can't help with creating formulas yet, but I'm still learning."