TechEcho
Generative AI and the big buzz about small language models

13 points by milliondreams over 1 year ago

3 comments

milliondreams over 1 year ago
As we see these systems evolving, I have come to believe specialist small language models with an MoE framework are the future of the industry.
swimwiththebeat over 1 year ago
Does anyone know if this is using the Mamba architecture[1] instead of transformers? It looks like it uses a state space model (SSM) layer.

[1]: https://arxiv.org/abs/2312.00752
compressedgas over 1 year ago
Piece with less detail than the source linked from the article: https://www.together.ai/blog/stripedhyena-7b