科技回声

A technology news platform built with Next.js, offering global tech news and discussion.


DeepSeek-V2: A Strong, Economical, and Efficient MoE Language Model

14 points, by jasondavies, about 1 year ago

2 comments

unraveller, about 1 year ago
It's claiming to be llama3-70B tier in strength, 3x cheaper, and 3-5x faster than it, due to only having 21B out of 400B+ parameters activated at any one time. And L3-70B normally costs <$1/million tokens.
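The speed claim follows from how MoE routing works: per-token compute scales roughly with the *activated* parameter count, not the total. A back-of-envelope sketch using the comment's own figures (21B activated, compared against a dense 70B baseline; the cost model here is an illustrative simplification, not from the thread):

```python
def relative_cost(active_params_b: float, dense_params_b: float) -> float:
    """Per-token compute of a sparse (MoE) model relative to a dense baseline.

    Assumes per-token FLOPs scale linearly with the number of parameters
    actually activated for that token -- a rough but standard approximation.
    """
    return active_params_b / dense_params_b

# 21B activated (per the comment) vs. a dense llama3-70B-class model:
ratio = relative_cost(21, 70)
print(f"~{1 / ratio:.1f}x fewer FLOPs per token")  # ~3.3x
```

That ~3.3x figure sits inside the commenter's "3-5x faster" range; real-world speedups also depend on memory bandwidth and routing overhead, which this sketch ignores.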
bearjaws, about 1 year ago
Its performance at 21B parameters is very impressive.

I also like using something between 13B and 70B parameters, since it will run on a 32GB MacBook Pro easily.
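As a rough sanity check on the 32GB claim above: the RAM needed just to hold model weights is parameter count times bytes per weight, which depends on quantization. A minimal sketch (illustrative arithmetic, not from the thread; it ignores KV cache and runtime overhead):

```python
def weight_memory_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate GiB of RAM for model weights alone (no KV cache/overhead)."""
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30

# A 21B-parameter model at common quantization levels:
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_memory_gb(21, bits):.1f} GB")
# 16-bit: ~39.1 GB, 8-bit: ~19.6 GB, 4-bit: ~9.8 GB
```

So the full 16-bit weights would not fit in 32GB, but common 8-bit or 4-bit quantizations fit comfortably, which is consistent with the commenter's experience.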