Claude 3.5 Sonnet. Does it really outperform GPT-4o?

4 points by datacog, 11 months ago

The new Sonnet model clearly beats GPT-4o in the published benchmarks. I evaluated it on real-world use cases and compared it against GPT-4o. It matched or beat GPT-4o in every case.

Test Case 1: Python Code Generation
Write a script to generate an email address from a name and a domain.

Test Case 2: Web Page Creation
Create an HTML file that displays a simple personal portfolio webpage. The webpage should include a header with your name, a profile picture, a brief introduction about yourself, and a list of your skills. Use basic HTML tags to structure the content and include some inline CSS to style the elements.

Test Case 3: API Query Generation
Write a cURL command to call the dall-e-3 API and generate an image of a unicorn with a rainbow horn.

Assessment:
- Directness: Sonnet gives a more direct response to coding requests. When we asked for a cURL command, Claude returned exactly that, whereas GPT-4o wrapped it in a bash script.
- Web page: The page Claude created was much more aesthetically pleasing and almost usable as-is. Great for non-technical folks who want to create web pages.
- Python code generation: Hard to call, since both perform well. GPT-4o needs somewhat more detailed instructions.
- Pricing: Claude is cheaper than GPT-4o ($3 per million input tokens vs. $5 per million input tokens for GPT-4o).
- Speed: Claude is faster at generating the first token.

Here's a detailed write-up: https://blog.getbind.co/2024/06/21/claude-3-5-sonnet-does-it-outperform-gpt-4o/
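For readers wondering what Test Case 1 is asking for, here is a minimal sketch of one possible answer. It is not the output of either model; the function name `make_email`, the `first.last@domain` format, and the accent-stripping step are all assumptions made for illustration, since the prompt itself does not specify a format.

```python
import re
import unicodedata


def make_email(full_name: str, domain: str, sep: str = ".") -> str:
    """Build an address such as first.last@domain from a person's name.

    Hypothetical helper: the address format is an assumption, not part
    of the original test prompt.
    """
    # Decompose accented characters and drop the non-ASCII marks,
    # so that "José" becomes "Jose".
    ascii_name = (
        unicodedata.normalize("NFKD", full_name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Keep only lowercase alphanumeric runs as the name parts.
    parts = re.findall(r"[a-z0-9]+", ascii_name.lower())
    if not parts:
        raise ValueError("name contains no usable characters")

    # Normalize the domain and tolerate a leading "@".
    clean_domain = domain.strip().lower().lstrip("@")
    if not clean_domain:
        raise ValueError("domain is empty")

    return f"{sep.join(parts)}@{clean_domain}"


if __name__ == "__main__":
    print(make_email("Ada Lovelace", "example.com"))
```

A prompt this open-ended leaves choices such as the separator and the handling of middle names to the model, which may be part of why the post finds the two models hard to separate on this case.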

1 comment

throwaway888abc, 11 months ago
Clickable link: https://blog.getbind.co/2024/06/21/claude-3-5-sonnet-does-it-outperform-gpt-4o/