I made 4000 agent calls in Cursor last month. Each model has a personality

8 points by mike210, 3 days ago
The lazy architect (OpenAI's o3). o3 is incredibly lazy at writing code, but very good at planning. It will happily read tens of files and do deep analysis, but often struggles in scenarios where it needs to edit more than one file.

The over-eager child (Claude Sonnet 3.7 Thinking). Claude Sonnet is eager to just get going, man! It's not the most careful, and in longer strings of tool calls, it may start editing something completely unrelated to what you asked it to.

Pretty balanced? (Gemini 2.5 Pro). Gemini 2.5 is a little more intelligent, and significantly faster and more reserved than Sonnet 3.7. Usually the best choice for writing code in multiple files.

I've found o4-mini to be incredibly slow and fairly mediocre, and GPT 4.1 useful only in specific situations. My tips:

- Use o3 to plan and/or write code in one or at most two files. If you do more, it may openly revolt and just refuse to write any longer.

- Always make sure Sonnet 3.7 is following a tightly scoped plan on a relatively small section of the product, and supervise it. If you have an easy change to make in many areas of your codebase, for example, letting Sonnet run, still supervised, is a perfect use of the model's persona.

Generally what I do:

- Medium complexity: editing one file: o3. Editing multiple files: plan with o3, write with gemini-2.5.

- Simple complexity: editing many files, very simple: plan with o3 if needed, write with claude-3.7. Editing many files, simple, needs a formulaic approach: write a detailed prompt into GPT 4.1.

- High complexity: plan with o3, separate into multiple chunks, write small chunks at a time with gemini-2.5 and be very careful with each section. If I'm super lazy, sometimes I just YOLO all of the sections and then fix all the bugs at the end, but this probably leads to code issues later down the line.

Would love to hear how other people are using the different models!
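For readers who want the "generally what I do" list in one place, here is a minimal sketch of it as a lookup function. Everything about the shape is an assumption made for illustration: `Route`, `pick_models`, and the `formulaic` flag are hypothetical names, and the model strings are just the labels used in the post, not real API identifiers. The post only covers the "simple" case for many files; the sketch applies the same choice at any file count.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    """Which model plans and which one writes (hypothetical structure)."""
    planner: Optional[str]  # None means no separate planning step
    writer: str
    note: str = ""


def pick_models(complexity: str, files: int, formulaic: bool = False) -> Route:
    """Encode the post's rules of thumb; labels are the post's, not API names."""
    if files < 1:
        raise ValueError("files must be at least 1")
    if complexity == "high":
        # Plan with o3, then write small chunks with Gemini, reviewing each.
        return Route("o3", "gemini-2.5", "split into small chunks; review each")
    if complexity == "medium":
        if files == 1:
            return Route(None, "o3")  # single file: o3 does it all
        return Route("o3", "gemini-2.5")
    if complexity == "simple":
        if formulaic:
            return Route(None, "gpt-4.1", "write a detailed prompt")
        # Assumption: same choice regardless of file count.
        return Route("o3", "claude-3.7", "planning optional; supervise")
    raise ValueError(f"unknown complexity: {complexity!r}")
```

For example, `pick_models("medium", 3)` gives a route that plans with o3 and writes with gemini-2.5, matching the "editing multiple files" rule above.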

2 comments

muzani, 2 days ago
Sonnet 3.5 has a very different personality. It's less skilled, but often I opt for it because of the personality.

DeepSeek is actually pretty good and underappreciated too. It feels unreliable though. Downside is tool use, but I prefer it over o3.
joegibbs, 3 days ago
I really like the code that Gemini 2.5 Pro writes, but it tends to stop for no reason and needs to be reprompted to start again. I'm not sure why this is. Also, what's the difference between 2.5 Pro and 2.5 Pro Max? Or Claude 3.7 and 3.7 Max?

Aside: it would be good for Cursor to add something to tell their agents not to run tool calls that run forever (like test watchers). I add this in my .mdc files, but I think it would be a good default so that it can run tests, update the code, and run them again until it works.
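One way to picture the "don't run forever" guard outside of a rules file is a wrapper that puts a hard time limit on any command. This is only a sketch under assumptions: `run_bounded` is a hypothetical helper, not something Cursor provides, and it relies only on the standard library's `subprocess.run` with its `timeout` argument.

```python
import subprocess
import sys


def run_bounded(cmd, timeout_s=60.0):
    """Run cmd, but give up if it has not exited within timeout_s seconds.

    Hypothetical helper: a watcher-style command that never exits is
    killed by subprocess.run and reported as timed out instead of hanging.
    """
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return {"timed_out": True, "returncode": None, "stdout": ""}
    return {
        "timed_out": False,
        "returncode": done.returncode,
        "stdout": done.stdout,
    }


if __name__ == "__main__":
    # A stand-in for a long-running test watcher: sleeps far past the limit.
    result = run_bounded(
        [sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=1
    )
    print(result["timed_out"])
```

A one-shot test command would go through the same wrapper and return normally, so the run, edit, rerun loop described above would still work.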