科技回声


Think Fast: Tensor Streaming Processor for Accelerating Deep Learning Workloads [pdf]

61 points · by blopeur · over 4 years ago

3 comments

ZeroCool2u · over 4 years ago
Though their primary test case was just ResNet, at first glance the results here are encouraging. They claim a fairly staggering performance increase:

"Compared to leading GPUs [42], [44], [59], the TSP architecture delivers 5× the computational density for deep learning ops. We see a direct speedup in real application performance as we demonstrate a nearly 4× speedup in batch-size-1 throughput and a nearly 4× reduction of inference latency compared to leading TPU, GPU, and Habana Labs' GOYA chip."

It is challenging to directly compare a GPU against an ASIC-style chip like this. I would like to see more detailed performance comparisons against something like Google's TPU.
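As a rough illustration of the batch-size-1 comparison the comment refers to, here is a minimal, framework-agnostic sketch of how one might measure batch-size-1 inference latency and derive throughput. The `fake_resnet_step` stand-in is hypothetical (not Groq's or Google's benchmark harness); in practice you would substitute a real forward pass and synchronize the accelerator before each timestamp.

```python
import time
import statistics

def benchmark_batch1(run_inference, warmup=10, iters=100):
    """Measure batch-size-1 latency (ms) and the throughput it implies."""
    for _ in range(warmup):  # warm caches/JIT before timing
        run_inference()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        run_inference()
        samples.append((time.perf_counter() - t0) * 1e3)  # elapsed ms
    p50 = statistics.median(samples)
    # At batch size 1, throughput is just the reciprocal of latency.
    return {"p50_ms": p50, "throughput_per_s": 1e3 / p50}

# Hypothetical CPU-bound stand-in for one ResNet forward pass at batch size 1.
def fake_resnet_step():
    sum(i * i for i in range(10_000))

stats = benchmark_batch1(fake_resnet_step)
```

Reporting the median rather than the mean keeps one-off scheduler hiccups from distorting the latency figure, which matters when comparing chips whose headline difference is "nearly 4×".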
justicezyx · over 4 years ago
I am guessing that Groq did the wrong thing here.

To my eyes, deep-learning ASICs are generally only meaningful in two separate scenarios: a high-power, high-scale data-center training chip, or a low-power, highly efficient edge inference chip.

The TSP appears to be a throughput-oriented, high-power inference chip. I don't know of any decent-size market that can support such a chip from a start-up.
person_of_color · over 4 years ago
Why did Wave Computing fail?