科技回声
A tech-news platform built with Next.js, providing global technology news and discussion.

© 2025 科技回声. All rights reserved.

AI chip startup Groq grabs the spotlight

9 points | by makaimc | about 1 year ago

1 comment

brucethemoose2 | about 1 year ago
Groq's inference strategy appears to be "SRAM only": there is no external memory such as GDDR or HBM. Instead, large models are split across networked cards, and the inputs/outputs are pipelined.

This is a great idea... in theory. But it seems like the implementation (IMO) missed the mark.

They are using reticle-size dies, running at high TDPs, at one die per card, with long wires running the interconnect.

A recent Microsoft paper proposed a similar strategy, but with much more economical engineering: much smaller, cheaper SRAM-heavy chips tiled across a motherboard, with no need for a power-hungry long-range interconnect and no expensive dies on expensive PCIe cards. The interconnect is physically much shorter and lower power by virtue of being *on* a motherboard.

In other words, I feel that Groq took an interesting inference strategy and ignored a big part of what makes it cool, packaging the chips like PCIe GPUs instead of tiled accelerators. Combined with the node disadvantage and the compatibility disadvantage, I'm not sure how they can avoid falling into obscurity like Graphcore, which took a very similar SRAM-heavy approach.
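The "split the model across cards and pipeline the inputs/outputs" idea can be sketched in a few lines. This is a toy illustration of pipeline parallelism, not Groq's actual design: each "card" holds a slice of the model's layers (standing in for weights resident in on-chip SRAM), and micro-batches are streamed so that every stage stays busy. The `SRAM_PER_CARD` capacity and the multiply-only "layers" are made up for the example.

```python
from collections import deque

SRAM_PER_CARD = 4  # hypothetical capacity: how many "layers" fit on one card


def partition(layers, capacity):
    """Split a list of layer weights into contiguous per-card stages."""
    return [layers[i:i + capacity] for i in range(0, len(layers), capacity)]


def run_pipeline(stages, inputs):
    """Stream micro-batches through the stages, one hop per time step."""
    in_flight = deque()        # (activation, index of its next stage)
    pending = deque(inputs)    # micro-batches waiting to enter stage 0
    outputs = []
    while pending or in_flight:
        # Advance every in-flight micro-batch by exactly one stage.
        for _ in range(len(in_flight)):
            x, s = in_flight.popleft()
            for w in stages[s]:        # apply this card's layers
                x = x * w              # toy "layer": scalar multiply
            if s + 1 < len(stages):
                in_flight.append((x, s + 1))  # hand off over the interconnect
            else:
                outputs.append(x)             # final card emits the result
        if pending:                    # inject the next micro-batch at stage 0
            in_flight.append((pending.popleft(), 0))
    return outputs


layers = [1, 2, 3, 4, 5, 6, 7, 8]          # 8 toy layers
stages = partition(layers, SRAM_PER_CARD)  # -> 2 cards of 4 layers each
print(run_pipeline(stages, [1, 1, 1]))     # -> [40320, 40320, 40320]
```

Note that once the pipeline is full, one result emerges per step regardless of how many cards the model spans; the cost of the split shows up as latency (one hop per card) and interconnect traffic, which is exactly the part the tiled-on-motherboard proposal tries to make cheap.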