GPT-4-turbo preliminary benchmark results on code-editing

74 points by heliophobicdude, over 1 year ago

10 comments

exo-pla-net, over 1 year ago
So it appears that GPT-4-Turbo is indeed (at least marginally) smarter than the previous GPT-4, just as Altman claimed. Also, it's faster and cheaper, with a massive context window. Exciting!
jpdus, over 1 year ago
For other (non-code) benchmarks, people are having the opposite experience:

"I benchmarked on SAT reading, which is a nice human reference for reasoning ability. Took 3 sections (67 questions) from an official 2008-2009 test (2400 scale) and got the following results, here a SAT-like test:

- GPT3.5 - 690 (10 wrong)
- GPT4 - 770 (3 wrong)
- GPT4-turbo (one section at time) - 740 (5 wrong)
- GPT4-turbo (3 sections at once, 9K tokens) - 730 (6 wrong)"

Source: https://twitter.com/wangzjeff/status/1721934560919994823?t=PcAm8yVbU_odyqK9e53MAA&s=19
xeckr, over 1 year ago
Back in April it would only generate a handful of tokens per second. The speed improvements for GPT-4 are staggering. I wonder how much of it is because Microsoft is making GPUs rain on OpenAI, and how much of it is due to improvements to the model and its scaffolding.
meiraleal, over 1 year ago
Over the past few days, ChatGPT went from a great pair-programming helper to a useless, antipathetic intern; the quality of generated code dropped visibly. The context seems to be bigger in the ChatGPT Plus version too, but it got dumber.
cloudking, over 1 year ago
Has anyone been able to access the 128k context window? I'm not seeing that option in the API playground.
Racing0461, over 1 year ago
Reddit thread on the opposite experience: https://www.reddit.com/r/ChatGPT/comments/17prwlg/gpt4_turbo_is_unusable_for_coding_and_various/
ttul, over 1 year ago
The progress here is remarkable. A year ago, we didn’t even have ChatGPT. LLM completions were cool but so hard to use and definitely there was nothing accessible to non-nerds.
kristianp, over 1 year ago
Aider sounds like a cool tool, I'll have to try it out. I'm assuming it makes use of your local files and edits them for you?

Are there any other programming assistant packages that use the ChatGPT API like this?

Regarding rate limits, it might be an idea to have configurable delays built into the testing code to prevent hitting limits.
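
A minimal sketch of the configurable-delay idea in the comment above, in Python. The names `run_with_delay`, `run_case`, and `delay_seconds` are hypothetical and are not taken from Aider's benchmark code; this only shows a fixed pause between calls, and a real harness would probably also want to retry with backoff when a rate-limit error does occur.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_with_delay(
    cases: Iterable[T],
    run_case: Callable[[T], R],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Run each benchmark case, pausing a configurable time between calls.

    `run_case` stands in for whatever function sends one request to the
    model API (an assumption for illustration). `sleep` is injectable so
    the delay can be replaced in tests.
    """
    results: List[R] = []
    for index, case in enumerate(cases):
        # Pause before every call except the first one.
        if index > 0 and delay_seconds > 0:
            sleep(delay_seconds)
        results.append(run_case(case))
    return results
```

For example, `run_with_delay(exercises, ask_model, delay_seconds=2.0)` would space calls at least two seconds apart, where `exercises` and `ask_model` are placeholders for the test cases and the API call.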
Racing0461, over 1 year ago
Is this just the API, or does it work on ChatGPT also?
vouaobrasil, over 1 year ago
Programmers here seem excited about the potential of this new version, but I can't help but wonder how naive this attitude really is. Even if AI never becomes intelligent like us, if it can emulate this intelligence in enough domains, then it has a serious chance of being dangerous. It's already pretty much guaranteed that it will put almost everyone out of a job, turning the vast majority of humans into content-consuming sloths.

Does it really make sense to play with this kind of power?