
Ask HN: Claude 3.5 Sonnet vs. o1 vs. <other> for coding. Let's talk!

9 points by notnotrishi 6 months ago
Both o1 (mini/preview) and Claude 3.5 Sonnet seem to be popular among devs, but opinions are divided and all over the place. From my experience, both have their strengths and weaknesses, and I find myself switching between them.

If you've used either, or ideally both, I'd love to hear your insights. I feel answers to the following questions will provide some context when you respond:

- What are the strengths & weaknesses of each, from your experience?

- Any tips/tricks or prompting techniques you use to get the most out of these models?

- How do you typically use them? (via native apps like ChatGPT and Claude, or via Cursor, GitHub Copilot, etc.)

- What programming language(s) do you primarily use them with?

Hopefully this thread provides a useful summary and some additional tips for readers.

(I'll start with mine in the comments.)

9 comments

sdrinf 6 months ago
o1 for collabing on design docs and overall structure, break it into tasks per preference / sort; Sonnet/o1 for executing each small task.

o1 is higher quality, more nuanced, and has deeper understanding; the biggest downsides right now are the significantly higher latency (both due to thinking, and also continue.dev doesn't support o1 streaming currently, so you're waiting until it's all done) and the higher cost.

In terms of tools: either VS Code with continue.dev / Cline, or Cursor.

Languages: Node.js / JavaScript, and lately C# / .NET / Unity.
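To make the streaming point concrete, here is a minimal sketch of a blocking call to o1-preview next to a streamed call to Sonnet through the official Python SDKs. Model names, the prompt, and reading API keys from environment variables are assumptions, not something from the thread:

    import anthropic
    from openai import OpenAI

    prompt = "Refactor this function to be iterative: ..."

    # Blocking request: nothing is shown until the full response
    # (including the model's thinking time) has come back.
    openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
    o1_resp = openai_client.chat.completions.create(
        model="o1-preview",
        messages=[{"role": "user", "content": prompt}],
    )
    print(o1_resp.choices[0].message.content)

    # Streamed request: tokens are printed as they are generated,
    # so perceived latency is much lower even for long answers.
    anthropic_client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
    with anthropic_client.messages.stream(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    ) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)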
dauertewigkeit 6 months ago
I prefer o1. I mostly use it as a knowledge system. Don't really care for the automatic code generation nonsense, unless I'm really tired and the task is very simple, in which case I might decide to write a paragraph of text instead of 30 lines of Python. My experience is that when ChatGPT fails, Claude fails too. On some advanced coding tasks, I find ChatGPT's depth of reasoning to be better.
pizza 6 months ago
o1:

- better when the response has to address many subgoals coherently

- usually will not undo bugfix progress made earlier in the conversation, whereas with Claude, in extremely long conversations I have noticed it letting bugs it had already fixed get reintroduced much later

Claude:

- image inputs are very complementary for debugging, especially for anything visual (e.g. debugging why a GUI framework rendered your UI in an unexpected way: just include a screenshot)

- surprisingly good at taking descriptions of algorithmic or mathematical procedures and making captioned SVG illustrations, then taking screenshots of those SVGs plus user feedback to improve the next version of the illustrations

- more recent knowledge cutoff, so generally somewhat less likely to deny that newer APIs/things exist (e.g. o1 told me tokenizer.apply_chat_template and meta-llama/Llama-3.2-1B-Instruct both did not exist and removed them from the code I was feeding it)
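For reference, both names in that last example are real: tokenizer.apply_chat_template is a Hugging Face Transformers method, and meta-llama/Llama-3.2-1B-Instruct is a gated model on the Hub. A minimal sketch, with made-up chat content, assuming you have accepted the model's license and authenticated with the Hub:

    from transformers import AutoTokenizer

    # Gated model: requires accepting Meta's license and Hub authentication.
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")

    messages = [
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a one-line Python hello world."},
    ]

    # Renders the chat into the model's prompt format instead of hand-building it.
    prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    print(prompt)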
notnotrishi 6 months ago
My notes:

- Sonnet 3.5 seems good at code generation, and o1-preview seems good at debugging

- Sonnet 3.5 struggles with long contexts, whereas o1-preview seems good at identifying interdependencies between files in a code repo when answering complex questions

- Breaking the problem into small steps seems to yield better results with Sonnet

- I use them primarily in Cursor/GH Copilot, and with Python
Imanari 6 months ago
I like aider with claude-3-5-sonnet-20241022; haven't tried it with o1, though.

Also, https://aider.chat/docs/scripting.html offers some nice possibilities.
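A minimal sketch of the kind of thing that scripting page covers, from the shell and from Python. The file name and prompts here are placeholders, and aider's Python API may change between versions:

    # Command-line scripting: apply one change non-interactively.
    #   aider --model claude-3-5-sonnet-20241022 --message "add type hints" app.py

    # Python scripting (a sketch; the API may change between aider versions):
    from aider.coders import Coder
    from aider.models import Model

    model = Model("claude-3-5-sonnet-20241022")
    coder = Coder.create(main_model=model, fnames=["app.py"])  # files aider may edit
    coder.run("add a docstring to every public function")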
KingOfCoders 6 months ago
Started a small project to compare AI IDEs:

https://github.com/StephanSchmidt/ai-coding-comparison/

(no comparison there yet, just some code to play around with)
MeetingsBrowser 6 months ago
Are there any concrete benchmarks for comparing models for different types of programming tasks?
muzani 6 months ago
o1 if you're going to write full specs and not provide any context.

Sonnet 3.5 if you can provide context (e.g. with Cursor).

gpt-4o for UI design. Also for solving screenshots of interviews.
ldjkfkdsjnv 6 months ago
o1 is much better at finding complex needle-in-the-haystack bugs and fixes. Sonnet 3.5 is better at shallow, generic coding.