I think people are forgetting that transformer architectures are a broader field than GPT and predate GPT-3 by 3+ years. Referring to transformer architectures by a branded commercial moniker (GPT) is just going to help cement OpenAI's brand exposure and, soon, regulatory capture.

For comparison, this would be like referring to convnets as Inception architectures back during the CV boom (or VGGNets before that).
Nice! The README mentions `LayerNorm` is implemented here, and it shows up in the equivalence tests against PyTorch, but I don't see it anywhere in the implementation itself.
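For reference, the kind of thing the equivalence test would be checking is roughly this (a minimal sketch against PyTorch's built-in, not the repo's actual code):

```python
import torch

def layer_norm(x, weight, bias, eps=1e-5):
    # Normalize over the last dimension: subtract the mean and divide by
    # the (biased) standard deviation, then apply the learned affine params.
    mean = x.mean(dim=-1, keepdim=True)
    var = x.var(dim=-1, keepdim=True, unbiased=False)
    return (x - mean) / torch.sqrt(var + eps) * weight + bias

# Equivalence check against torch.nn.LayerNorm
ln = torch.nn.LayerNorm(8)
x = torch.randn(2, 4, 8)
assert torch.allclose(layer_norm(x, ln.weight, ln.bias), ln(x), atol=1e-6)
```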