Microsoft says GPT 3.5 has 20B parameters?

57 points by Heidaradar over 1 year ago

13 comments

dang over 1 year ago
It's against HN's guidelines to editorialize titles like this. From https://news.ycombinator.com/newsguidelines.html:

"Please use the original title, unless it is misleading or linkbait; don't editorialize."

If you want to say what you think is important about an article, that's fine, but do it by adding a comment to the thread. Then your view will be on a level playing field with everyone else's: https://hn.algolia.com/?dateRange=all&page=0&prefix=false&sort=byDate&type=comment&query=%22level%20playing%20field%22%20by:dang
WendyTheWillow over 1 year ago
I just want the 100k context window Anthropic gave Claude, but for GPT. Claude likes to hallucinate when I ask it to build chapter notes for a class I’m taking, and I don’t want to have to break up the text into tiny bits for GPT…
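The workaround being described, splitting a long text into pieces small enough for a smaller context window, looks roughly like this minimal sketch. The 3,000-token budget and whitespace tokenization are assumptions for illustration; a real tokenizer (e.g. tiktoken for GPT models) would count tokens the way the model does.

```python
def chunk_text(text: str, max_tokens: int = 3000) -> list[str]:
    """Split text into pieces of at most max_tokens whitespace tokens.

    Whitespace-splitting only approximates real tokenization, but it is
    enough to show the shape of the workflow.
    """
    words = text.split()
    return [" ".join(words[i:i + max_tokens])
            for i in range(0, len(words), max_tokens)]

# Example: a long transcript gets split into context-window-sized pieces,
# each of which would be summarized separately and the results merged --
# exactly the multi-step "tiny bits" process the comment is trying to avoid.
transcript = "word " * 10_000
for n, chunk in enumerate(chunk_text(transcript)):
    print(f"chunk {n}: {len(chunk.split())} tokens (approx.)")
```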
Flux159 over 1 year ago
Link to the paper directly: https://arxiv.org/pdf/2310.17680.pdf

The system that they describe, CodeFusion, is interesting because it's a diffusion model for generating code rather than an autoregressive model like most LLM code generators.
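To make that distinction concrete, here is a toy sketch (not the paper's actual method, and with random stand-ins for the learned models): an autoregressive decoder generates strictly left to right, one token per model call, while a diffusion-style decoder starts from noise over the whole sequence and repeatedly refines arbitrary positions over a fixed number of denoising steps.

```python
import random

VOCAB = ["def", "f", "(", "x", ")", ":", "return", "x", "+", "1"]

def fake_next_token(prefix: list[str]) -> str:
    """Stand-in for a learned autoregressive model p(token | prefix)."""
    return random.choice(VOCAB)

def fake_denoise(seq: list[str]) -> list[str]:
    """Stand-in for one learned denoising step over the full sequence."""
    seq = list(seq)
    seq[random.randrange(len(seq))] = random.choice(VOCAB)
    return seq

# Autoregressive decoding: left to right, one token per model call.
tokens: list[str] = []
for _ in range(8):
    tokens.append(fake_next_token(tokens))

# Diffusion-style decoding: start from "noise" over all 8 positions,
# then run a fixed number of refinement steps that can edit any position.
seq = [random.choice(VOCAB) for _ in range(8)]
for _ in range(20):
    seq = fake_denoise(seq)

print("autoregressive:", " ".join(tokens))
print("diffusion-like:", " ".join(seq))
```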
superkuh over 1 year ago
It says gpt-3.5-turbo has 20B parameters, which I believe. But it also says gpt-3.5 (text-davinci-003) has 175B parameters, which I believe as well.
behnamoh over 1 year ago
This was expected, because open-source models of the same size already beat GPT-3.5 in many ways. And Mistral 7B makes you wonder whether huge parameter counts are even needed for something at GPT-3.5's level.
nmstoker over 1 year ago
I could be wrong, but I believe the original comment that's used as the title here comes from this tweet: https://twitter.com/felix_red_panda/status/1718916631512949248?t=jmCkeVH1Hyyu4vmwY-NhQg&s=19
sidcool over 1 year ago
Didn't it leak earlier that it is 100 billion? And GPT-4 is 1.17 trillion?
jsight over 1 year ago
That sounds incredible given how powerful the model is.
kaspermarstal over 1 year ago
Mhm, interesting development: the paper has been withdrawn.
leobg over 1 year ago
That would certainly explain the pricing (gpt-3.5 vs davinci).
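For context, the list prices at the time (as widely reported when gpt-3.5-turbo launched; treat the exact figures as approximate) were $0.02 per 1K tokens for text-davinci-003 versus $0.002 per 1K for gpt-3.5-turbo:

```python
# Approximate historical list prices, USD per 1K tokens, at the time
# gpt-3.5-turbo launched (assumed figures, as widely reported).
davinci_price = 0.020  # text-davinci-003
turbo_price = 0.002    # gpt-3.5-turbo

print(f"davinci / turbo price ratio: {davinci_price / turbo_price:.0f}x")
# A ~10x price cut is at least consistent with serving a model roughly an
# order of magnitude smaller, though batching and other inference
# optimizations could explain part of the gap too.
```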
samsepi0l121 over 1 year ago
So it's possible to run gpt-3.5-turbo on a local machine?
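If the 20B figure were accurate, the raw weight memory would be within reach of high-end consumer hardware. A rough estimate, assuming dense weights and ignoring activation and KV-cache overhead:

```python
# Back-of-the-envelope weight memory for a hypothetical 20B-parameter
# dense model at various precisions (weights only; activations and the
# KV cache add more on top).
params = 20e9
for name, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    gib = params * bytes_per_param / 2**30
    print(f"{name}: {gib:.0f} GiB")
# fp16 ~37 GiB, int8 ~19 GiB, int4 ~9 GiB -- so a quantized 20B model
# could fit on a single 24 GB consumer GPU, whereas the 175B figure for
# text-davinci-003 could not.
```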
Alifatisk over 1 year ago
Did OpenAI ever publicize how many params GPT-4 had?
bmitc over 1 year ago
Just eating up water and fossil fuels.