As the discussion of GPT-4 heats up, the absence of details on its technical implementation becomes only more glaring. As an engineer, I have not learned anything applicable from the newest OpenAI publication that I didn't already know yesterday!

I have been investigating issues of LLM training and inference for quite some time, and have developed a number of hypotheses about future SoTA models, which I believe very likely apply to GPT-4.
I'd like to know how it can support a 32k context when all the other models I've seen are 2-4k. Does that mean it has a bigger attention layer, or that it's 4x as many billions of parameters?
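For intuition on why context length and parameter count are mostly decoupled: in a standard Transformer, the attention weight matrices don't depend on sequence length at all; what grows is the n-by-n score matrix, so activation memory and compute scale quadratically with context while attention parameters stay fixed. Here's a rough back-of-the-envelope sketch in Python (the d_model and head count are hypothetical GPT-3-scale values, since GPT-4's actual dimensions are not public):

    def attn_cost(seq_len, d_model=12288, n_heads=96):
        """Rough per-layer self-attention costs at a given context length.

        d_model and n_heads are illustrative GPT-3-scale guesses,
        not GPT-4's real (undisclosed) dimensions.
        """
        # Q, K, V, and output projection weights: these do NOT depend on
        # seq_len, so a longer context adds zero attention parameters.
        params = 4 * d_model * d_model
        # The attention score matrix is seq_len x seq_len per head, so
        # activation memory grows quadratically with context length.
        score_entries = n_heads * seq_len ** 2
        # QK^T and (scores @ V) each cost ~2 * n^2 * d_model multiply-adds.
        flops = 4 * seq_len ** 2 * d_model
        return params, score_entries, flops

    for n in (2048, 32768):
        params, scores, flops = attn_cost(n)
        print(f"ctx={n:6d}  params={params:.2e}  "
              f"score entries={scores:.2e}  flops={flops:.2e}")

So going from 2k to 32k leaves attention parameters unchanged but makes the score matrix roughly 256x bigger, which is why long-context models typically lean on efficient attention variants (sparse, windowed, or memory-optimized kernels) rather than simply more parameters.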