
Deepseek R1-0528

436 points · by error404x · 4 days ago

13 comments

jacob019 · 4 days ago
Well that didn't take long, available from 7 providers through openrouter: https://openrouter.ai/deepseek/deepseek-r1-0528/providers

May 28th update to the original DeepSeek R1. Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass.

Fully open-source model.
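For anyone who wants to poke at it, OpenRouter exposes an OpenAI-compatible endpoint, so a minimal sketch looks like this (the model slug is taken from the providers link above; the API key and prompt are placeholders, not from the thread):

```python
# Sketch: call DeepSeek R1-0528 through OpenRouter's OpenAI-compatible API.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # OpenRouter's OpenAI-compatible endpoint
    api_key="YOUR_OPENROUTER_API_KEY",         # placeholder key
)

response = client.chat.completions.create(
    model="deepseek/deepseek-r1-0528",         # slug from the providers link above
    messages=[{"role": "user", "content": "Summarize what changed in R1-0528."}],
)
print(response.choices[0].message.content)
```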
acheong08 · 4 days ago
No information to be found about it. Hopefully we get benchmarks soon. Reminds me of the days when Mistral would just tweet a torrent magnet link.
willchen · 4 days ago
I love how Deepseek just casually drops new updates (that deliver big improvements) without fanfare.
transcriptase · 4 days ago
Out of sheer curiosity: What’s required for the average Joe to use this, even at a glacial pace, in terms of hardware? Or is it even possible without using smart person magic to append enchanted numbers and make it smaller for us masses?
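A rough way to frame that question is weight memory = parameter count × bytes per weight. The numbers below are plain back-of-the-envelope arithmetic, not measured requirements, and they ignore KV cache and runtime overhead:

```python
# Rough weight-memory arithmetic for a 671B-parameter model at different
# quantization levels (weights only; KV cache and runtime overhead excluded).
PARAMS = 671e9

for label, bits_per_weight in [("FP16", 16), ("8-bit", 8), ("4-bit", 4), ("~2-bit dynamic", 2)]:
    gigabytes = PARAMS * bits_per_weight / 8 / 1e9
    print(f"{label:>15}: ~{gigabytes:,.0f} GB of weights")

# Only ~37B parameters are active per token (MoE), so speed can stay tolerable
# even when most expert weights are offloaded to system RAM or disk.
```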
danielhanchen · 3 days ago
For those interested, I made some 1-bit dynamic quants at https://huggingface.co/unsloth/DeepSeek-R1-0528-GGUF

74% smaller: 713GB down to 185GB.

Use the magic incantation -ot ".ffn_.*_exps.=CPU" to offload the MoE layers to RAM, allowing the non-MoE layers to fit in <24GB of VRAM at 16K context! The rest sits in RAM & disk.
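As a concrete illustration of that incantation, a launch wrapped in Python might look roughly like this. Only the -ot regex comes from the comment above; the llama-cli binary is llama.cpp's, while the quant filename and the numeric settings are assumed placeholders:

```python
# Sketch: run llama.cpp's llama-cli on the Unsloth GGUF, keeping MoE expert
# tensors in system RAM while the rest is offloaded to the GPU.
import subprocess

subprocess.run([
    "llama-cli",
    "-m", "DeepSeek-R1-0528-UD-IQ1_S.gguf",  # hypothetical local quant filename
    "-c", "16384",                           # ~16K context, as mentioned above
    "-ngl", "99",                            # push non-expert layers to VRAM
    "-ot", ".ffn_.*_exps.=CPU",              # keep MoE expert tensors on the CPU side
], check=True)
```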
karencarits · 4 days ago
What use cases are people using local LLMs for? Have you created any practical tools that actually increase your efficiency? I've been experimenting a bit but find it hard to get inspiration for useful applications.
jacob019 · 4 days ago
Not much to go off of here. I think the latest R1 release should be exciting. 685B parameters. No model card. Release notes? Changes? Context window? The original R1 has impressive output but really burns tokens to get there. Can't wait to learn more!
deepsquirrelnet · 4 days ago
I think it’s cool to see this kind of international participation in fierce tech competition. It’s exciting. It’s what I think capitalism should be.

This whole fascination in the US with “building moats” and buying competitors has gotten boring, obvious, and dull. The world benefits when companies struggle to be the best.
mjcohen · 4 days ago
Deepseek seems to be one of the few LLMs that run on an iPod Touch because of the older version of iOS.
AJAlabs · 4 days ago
671B parameters! Well, it doesn't look like I'll be running that locally.
htrp · 4 days ago
You're gonna need at least 8 H100 80GB cards for this...
cesarvarela · 4 days ago
About half the price of o4-mini-high for not that much worse performance; interesting.

Edit: most providers are offering a quantized version...
canergly · 4 days ago
I want to see it on Groq ASAP!