
Transformers Are Multi-State RNNs

41 points by DreamGen over 1 year ago

2 comments

joewferrara over 1 year ago
They show that decoder-only transformers (which GPTs are) are RNNs with infinite hidden-state size. Infinite hidden-state size is a pretty strong thing! Sounds interesting to me.
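A minimal sketch of that framing, as a toy single attention head in numpy (the names, shapes, and weights here are illustrative assumptions, not the paper's code): the RNN "hidden state" is the KV cache, a list that grows by one (key, value) pair per token instead of staying fixed-size, which is the sense in which the state is unbounded.

    # Toy sketch: one attention head viewed as a multi-state RNN whose
    # hidden state is the growing KV cache. Illustrative only.
    import numpy as np

    def step(state, x, W_q, W_k, W_v):
        """One autoregressive step: update the multi-state, emit an output.

        state: list of (key, value) pairs -- the RNN hidden state, which
               grows by one entry per token instead of staying fixed-size.
        x:     embedding of the current token, shape (d,).
        """
        state = state + [(W_k @ x, W_v @ x)]    # state update: append new k, v
        q = W_q @ x
        keys = np.stack([k for k, _ in state])  # (t, d)
        vals = np.stack([v for _, v in state])  # (t, d)
        scores = keys @ q / np.sqrt(len(q))     # causal: only past states exist
        probs = np.exp(scores - scores.max())   # stable softmax over states
        probs /= probs.sum()
        out = probs @ vals                      # output attends over all states
        return state, out

    # Usage: feed a toy sequence token by token, like an RNN.
    rng = np.random.default_rng(0)
    d = 8
    W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
    state = []
    for x in rng.normal(size=(5, d)):
        state, out = step(state, x, W_q, W_k, W_v)
    print(len(state), out.shape)  # 5 states after 5 tokens, output shape (8,)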
Icko over 1 year ago
I've seen at least six such papers, all along the lines of "<popular architecture> is actually <a slightly older concept>". Neural networks are generic enough that you can make them equivalent to almost anything.