
KAIST develops next-generation ultra-low-power LLM accelerator

97 points · by readline_prompt · about 1 year ago

7 comments

geuis · about 1 year ago
Want to reference Groq.com. They are developing their own inference hardware called an LPU: https://wow.groq.com/lpu-inference-engine/

They also released their API a week or two ago. It's *significantly* faster than anything from OpenAI right now. Mixtral 8x7B runs at around 500 tokens per second. https://groq.com/
moffkalast · about 1 year ago
> The 4.5-mm-square chip, developed using Korean tech giant Samsung Electronics Co.'s 28-nanometer process, has 625 times less power consumption compared with global AI chip giant Nvidia's A-100 GPU, which requires 250 watts of power to process LLMs, the ministry explained.

> processes GPT-2 with an ultra-low power consumption of 400 milliwatts and a high speed of 0.4 seconds

Not sure what the point of comparing the two is: an A100 will get you a lot more speed than 2.5 tokens/sec, and GPT-2 is just a 1.5B-parameter model; a Pi 4 would get you more tokens per second with CPU-only inference.

Still, I'm sure there are improvements to be made, and the direction is fantastic to see, especially after Coral TPUs have proven completely useless for LLM and Whisper acceleration. Hopefully it ends up as something vaguely affordable.
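The commenter's arithmetic can be checked directly from the figures quoted in the article (a quick sketch; reading "0.4 seconds" as per-token latency is the commenter's interpretation, not something the article states explicitly):

```python
# Back-of-the-envelope check of the figures quoted in the article.
a100_power_w = 250.0      # Nvidia A100 power draw, per the ministry's claim
kaist_power_w = 0.400     # 400 mW for the KAIST chip
seconds_per_token = 0.4   # reading "0.4 seconds" as per-token latency

power_ratio = a100_power_w / kaist_power_w   # the "625 times less power" claim
tokens_per_second = 1.0 / seconds_per_token  # the commenter's 2.5 tok/s figure

print(f"power ratio: {power_ratio:.0f}x")
print(f"throughput:  {tokens_per_second:.1f} tok/s")
```

Both of the thread's numbers fall out directly: 250 W / 400 mW is the claimed 625x, and one token per 0.4 s is the 2.5 tokens/sec the commenter compares against the A100 and the Pi 4.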
zachbee · about 1 year ago
Neuromorphic computing is cool, but not new tech. However, using a neuromorphic spiking architecture to run LLMs seems new. Unfortunately, there doesn't seem to be a paper associated with this work, so there's no deeper information on what exactly they're doing.
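For readers unfamiliar with the term, the basic unit of most neuromorphic designs is a spiking neuron. A minimal leaky integrate-and-fire model sketches the idea (a toy illustration of "spiking" in general, not KAIST's actual architecture, which, as noted, has no published paper):

```python
def lif_neuron(inputs, threshold=1.0, leak=0.9):
    """Toy leaky integrate-and-fire neuron: accumulates input current,
    leaks a fraction of its membrane potential each step, and emits a
    binary spike (then resets) whenever the potential crosses threshold."""
    v = 0.0
    spikes = []
    for x in inputs:
        v = leak * v + x      # leaky integration of the input current
        if v >= threshold:
            spikes.append(1)  # fire a spike...
            v = 0.0           # ...and reset the membrane potential
        else:
            spikes.append(0)
    return spikes

print(lif_neuron([0.5, 0.5, 0.5, 0.0, 0.9]))  # → [0, 0, 1, 0, 0]
```

The appeal for low-power hardware is that communication is sparse binary events rather than dense multiply-accumulates: a silent neuron costs (almost) nothing.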
dartos · about 1 year ago
> New structure mimics the layout of neurons and synapses

What does that mean, practically? How can you mimic that layout in silicon?
PeterStuer · about 1 year ago
Quick shoutout to https://youtube.com/@TechTechPotato for those interested in keeping tabs on the AI hardware space. There is much more going on in this area than you would think if you only follow general media.
bglazer · about 1 year ago
The article says 400 milliwatt power draw.

Wolfram Alpha says that's roughly equivalent to a cell phone's power draw while sleeping.
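Power alone doesn't settle the comparison; energy per token does. Taking the article's figures at face value, a rough estimate (the A100 throughput below is a hypothetical placeholder for illustration, not a measured number):

```python
# Rough energy-per-token comparison using the article's figures.
kaist_power_w = 0.400    # 400 mW (from the article)
kaist_tok_per_s = 2.5    # one token per 0.4 s (from the article)

a100_power_w = 250.0     # A100 power draw (from the article)
a100_tok_per_s = 50.0    # ASSUMED throughput for a GPT-2-sized model

kaist_j_per_tok = kaist_power_w / kaist_tok_per_s  # joules per token
a100_j_per_tok = a100_power_w / a100_tok_per_s

print(f"KAIST: {kaist_j_per_tok:.2f} J/token")
print(f"A100:  {a100_j_per_tok:.2f} J/token")
```

Even granting the A100 far higher throughput, the per-token energy gap stays large under these assumptions, which is presumably the point of the chip.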
pavelstoev · about 1 year ago
We build software acceleration for LLMs, effectively running smaller Llama 2 models on several L4s at the same performance as on a single A100.