TechEcho
GPT-4 architecture: what we can deduce from research literature

10 points, by kir-gadjello, about 2 years ago

3 comments

kir-gadjello, about 2 years ago
As the discussion of GPT-4 heats up, the absence of details on its technical implementation becomes only more glaring. As an engineer, I have not learned anything applicable from the newest OpenAI publication that I did not already know yesterday.

I have been investigating issues of LLM training and inference for quite some time, and have developed a number of hypotheses about future SoTA models, which I believe very likely apply to GPT-4.
Comment #35161480 not loaded
amrb, about 2 years ago
I'd like to know how it can support 32k context when all the other models I've seen are 2-4k. Does this mean it has a bigger attention layer, or is it billions of parameters larger?
Comment #35161858 not loaded
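One way to see why the question above matters: in a standard Transformer, self-attention cost grows quadratically with sequence length, while the parameter count is largely independent of it. This is a minimal back-of-the-envelope sketch (not anything OpenAI has disclosed about GPT-4) comparing the attention score matrix at a 2k versus a 32k context:

```python
# Hedged sketch: in vanilla self-attention, each head materializes a
# seq_len x seq_len score matrix, so cost scales with the SQUARE of the
# context length. Extending context from 2k to 32k therefore need not
# add "billions of parameters" -- the pressure is on attention compute
# and memory, not on weight count.

def attention_score_elements(seq_len: int, n_heads: int = 1) -> int:
    """Number of entries in the per-layer attention score matrices."""
    return n_heads * seq_len * seq_len

short_ctx = attention_score_elements(2048)    # ~2k-context model
long_ctx = attention_score_elements(32768)    # ~32k-context model

# 16x longer context -> 16^2 = 256x more score-matrix entries per head.
print(long_ctx // short_ctx)
```

This is why long-context variants typically lean on attention-efficiency tricks (sparse or windowed attention, better positional schemes) rather than simply being 4x or 16x larger in parameters.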
seydor, about 2 years ago
Well, if the model is so smart, could it be that it is actually aware of its own layers and parameters?
Comment #35161430 not loaded