Implementation of Google's Griffin Architecture – RNN LLM

218 points by milliondreams about 1 year ago

4 comments

VHRanger about 1 year ago
Like RWKV and Mamba, this mixes in some RNN properties to avoid the issues transformers have.

However, I'm curious about their scaling claims. They have a plot that shows how the model scales in training with the FLOPs you throw at it.

But the issue we should really be concerned with is the wall time of training for a set amount of hardware.

Back in 2018, we could train medium-sized RNNs; the issue was the wall time of training and training stability.
Comment #39994871 not loaded
Comment #39994916 not loaded
Comment #39995050 not loaded
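
For intuition on VHRanger's wall-time point: Griffin-style recurrences are linear in the hidden state, so the whole sequence can be computed with an associative (parallel) scan instead of a strictly sequential loop. Below is a minimal, hypothetical sketch in JAX of a diagonal gated recurrence, not Griffin's actual RG-LRU parameterization; all names are illustrative.

```python
import jax
import jax.numpy as jnp

# Toy diagonal linear recurrence: h_t = a_t * h_{t-1} + b_t, with h_0 = 0.
# (Simplified for illustration; Griffin's RG-LRU adds input/recurrence
# gating and a specific parameterization on top of this idea.)

def sequential(a, b):
    # O(T) serial steps: the wall-time bottleneck classic RNNs hit.
    h = jnp.zeros_like(b[0])
    out = []
    for t in range(b.shape[0]):
        h = a[t] * h + b[t]
        out.append(h)
    return jnp.stack(out)

def parallel(a, b):
    # Same recurrence via an associative scan. Composing the affine maps
    # h -> a1*h + b1 then h -> a2*h + b2 gives h -> (a1*a2)*h + (a2*b1 + b2),
    # and that composition is associative, so all prefixes can be computed
    # in O(log T) depth on parallel hardware.
    def combine(x, y):
        a1, b1 = x
        a2, b2 = y
        return a1 * a2, a2 * b1 + b2
    _, h = jax.lax.associative_scan(combine, (a, b))
    return h  # with h_0 = 0, the offset term of each prefix map is h_t

T, D = 16, 4
a = jax.nn.sigmoid(jax.random.normal(jax.random.PRNGKey(0), (T, D)))  # gates in (0, 1)
b = jax.random.normal(jax.random.PRNGKey(1), (T, D))
assert jnp.allclose(sequential(a, b), parallel(a, b), atol=1e-5)
```

The sequential loop is why 2018-era RNNs were slow to train in wall time; the scan formulation is the property these linear-recurrence models exploit to train at transformer-like speed.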
riku_iki about 1 year ago
I didn't get one detail: they selected a 6B transformer as the baseline and compared it to a 7B Griffin.

Why wouldn't they select equal-size models?
Comment #39994681 not loaded
janwas about 1 year ago
For anyone interested in a C++ implementation, our github.com/google/gemma.cpp now supports this model.
Comment #39999864 not loaded
spxneo about 1 year ago
I'm not smart enough to know the significance of this... is Griffin like Mamba?
Comment #39996525 not loaded