TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

Microsoft says GPT 3.5 has 20B parameters?

57 points | by Heidaradar | over 1 year ago

13 comments

dang | over 1 year ago
It's against HN's guidelines to editorialize titles like this. From https://news.ycombinator.com/newsguidelines.html:

"Please use the original title, unless it is misleading or linkbait; don't editorialize."

If you want to say what you think is important about an article, that's fine, but do it by adding a comment to the thread. Then your view will be on a level playing field with everyone else's: https://hn.algolia.com/?dateRange=all&page=0&prefix=false&sort=byDate&type=comment&query=%22level%20playing%20field%22%20by:dang
WendyTheWillow | over 1 year ago
I just want the 100k context window Anthropic gave Claude, but for GPT. Claude likes to hallucinate when I ask it to build chapter notes for a class I’m taking, and I don’t want to have to break up the text into tiny bits for GPT…
Flux159 | over 1 year ago
Link to the paper directly: https://arxiv.org/pdf/2310.17680.pdf

The system they describe, CodeFusion, is interesting because it's a diffusion model for generating code rather than an autoregressive model like most LLM code generators.
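The structural difference being pointed at here can be shown with a deliberately toy sketch. Nothing below is the paper's actual model: `TARGET` stands in for the data distribution the model has learned, and the "denoiser" is a fake that simply corrects a fraction of wrong positions per step. The point is only the control flow: autoregressive decoding commits to tokens strictly left to right and never revises them, while diffusion decoding starts from noise over the whole sequence and refines every position over several iterations.

```python
import random

TARGET = list("return x + 1")  # stand-in for the learned data distribution


def autoregressive(n):
    # Emits tokens one at a time, left to right; a written token is final.
    return [TARGET[i] for i in range(n)]


def diffusion(n, steps):
    # Start from pure noise over the entire sequence.
    random.seed(0)
    seq = [random.choice("abc xyz+1") for _ in range(n)]
    for _ in range(steps):
        # One "denoising" step: nudge all positions toward the data
        # distribution. This fake denoiser fixes about half of the
        # remaining wrong positions per step.
        wrong = [i for i in range(n) if seq[i] != TARGET[i]]
        if not wrong:
            break
        for i in wrong[: max(1, len(wrong) // 2)]:
            seq[i] = TARGET[i]
    return seq
```

With enough refinement steps the diffusion sketch converges to the same output the autoregressive one produces in a single left-to-right pass; the trade-off in the real systems is between parallel whole-sequence refinement and strictly sequential token emission.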
superkuh | over 1 year ago
It says gpt3.5-turbo has 20B parameters, which I believe. But it says gpt3.5 (text-davinci-003) has 175B parameters, which I also believe.
behnamoh | over 1 year ago
This was expected, because open-source models of the same size already beat GPT-3.5 in many ways. And Mistral 7B makes you wonder whether huge parameter counts are even needed for GPT-3.5-level performance.
nmstoker | over 1 year ago
I could be wrong, but I believe the original comment that's used as the title here comes from this tweet: https://twitter.com/felix_red_panda/status/1718916631512949248?t=jmCkeVH1Hyyu4vmwY-NhQg&s=19
sidcool | over 1 year ago
Didn&#x27;t it leak earlier that it is 100 billion? And GPT 4 is 1.17 Trillion?
jsight | over 1 year ago
That sounds incredible given how powerful the model is.
kaspermarstal | over 1 year ago
Mhm, interesting development: the paper has been withdrawn.
leobg | over 1 year ago
That would certainly explain the pricing (gpt-3.5 vs davinci).
samsepi0l121 | over 1 year ago
So it's possible to run gpt-3.5-turbo on a local machine?
Alifatisk | over 1 year ago
Did OpenAI ever publicize how many params GPT-4 had?
bmitc | over 1 year ago
Just eating up water and fossil fuels.