TechEcho
DeepSeek v2.5 – open-source LLM comparable to GPT-4, but 95% less expensive

193 points by jchook 7 months ago

20 comments

joshhart 7 months ago
The benchmarks compare it favorably to GPT-4-turbo but not GPT-4o. The latest versions of GPT-4o are much higher in quality than GPT-4-turbo. The HN title here does not reflect what the article is saying.

That said the conclusion that it's a good model for cheap is true. I just would be hesitant to say it's a great model.
viraptor 7 months ago
Why say comparable when gpt4o is not included in the comparison table? (Neither is the interesting Sonnet 3.5)

Here's an Aider leaderboard with the interesting models included: https://aider.chat/docs/leaderboards/ Strangely, v2.5 is below the old v2 Coder. Maybe we can count on v2.5 Coder being released then?
shamanic 7 months ago
In my experience, DeepSeek is my favourite model to use for coding tasks. It is not as smart an assistant as 4o or Sonnet, but it has outstanding task adherence, code quality is consistently top notch, and it is never lazy. Unlike GPT-4o or the new Sonnet (yuck), it doesn't try to be too smart for its own good, which actually makes it easier to work with on projects. The main downside is that it has a problem with looping, where it gets some concept or context inside its context and refuses to move on from it. However, if you remember the old GPT-4 (pre-turbo) days, then this is really not a problem: just start a new chat.
uxhacker 7 months ago
It’s interesting to see a Chinese LLM like DeepSeek enter the global stage, particularly given the backdrop of concerns over data security with other Chinese-owned platforms, like TikTok. The key question here is: if DeepSeek becomes widely adopted, will we see a similar wave of scrutiny over data privacy?

With TikTok, concerns arose partly because of its reach and the vast amount of personal information it collects. An LLM like DeepSeek would arguably have even more potential to gather sensitive data, especially as these models can learn from and remember interaction patterns, potentially accessing or “training” on sensitive information users might input without thinking.

The challenge is that we’re not yet certain how much data DeepSeek would retain and where it would be stored. For countries already wary of data leaving their borders or being accessible to foreign governments, we could see restrictions or monitoring mechanisms placed on similar LLMs, especially if companies start using these models in environments where proprietary information is involved.

In short, if DeepSeek or similar Chinese LLMs gain traction, it’s quite likely they’ll face the same level of scrutiny (or more) that we’ve seen with apps like TikTok.
jyap 7 months ago
This 236B model came out around September 6th.

DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.

From: https://huggingface.co/deepseek-ai/DeepSeek-V2.5
TZubiri 7 months ago
https://www.youtube.com/watch?v=OW-reOkee1Y (sorry for the shitty source)

A word of advice on advertising low-cost alternatives.

'The weaknesses make your low cost believable. [..] If you launched Ryan Air and you said we are as good as British Airways but we are half the price, people would go "it does not make sense"'
khanan 7 months ago
Did you try asking it whether Winnie the Pooh looks like the president of China?
zone411 7 months ago
In my NYT Connections benchmark, it hasn't performed well: https://github.com/lechmazur/nyt-connections/ (see the table).
DrPhish 7 months ago
I run it at home at q8 on my dual Epyc server. I find it to be quite good, especially when you host it locally and are able to tweak all the settings to get the kind of results you need for a particular task.
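
For a rough sense of what hosting this locally involves, here is a minimal back-of-the-envelope sketch. It assumes the 236B parameter count cited elsewhere in the thread and exactly 8 bits per weight for "q8"; real quantization formats add some per-block overhead, and the KV cache and activations need memory on top of this. The function name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: parameter count taken from the "236B" figure quoted in the thread.
N_PARAMS = 236e9

# Assumption: idealized bit widths with no quantization metadata overhead.
for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(N_PARAMS, bits):.0f} GB")
```

Under these assumptions the weights alone come to roughly 236 GB at 8 bits, which is why a large-memory server rather than a single consumer GPU is the plausible home setup here.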
gdevenyi 7 months ago
What does open source mean here? Where's the code? The weights?
patrickhogan1 7 months ago
It’s cheaper, but where do you get the initial free credits? It seems most models get such a boost and lock-in from the initial free credits.
nextworddev 7 months ago
Where are the servers hosted, and is there any proof that the data doesn’t cross overseas to China?
Alifatisk 7 months ago
Oh wow, it almost beats Claude 3 Opus!
ziofill 7 months ago
What about comparisons to Claude 3.5? Sneaky.
BoNour 7 months ago
Not bad for a 250B model. It would be more impressive if, with more fine-tuning, it matched the performance of GPT-4.
evil_yam 7 months ago
open model, not open-source model
nprateem 7 months ago
As in significantly worse than..?
Giorgi 7 months ago
In what world is this "comparable"? It looks like another Chinese ChatGPT "alternative" that is crap.
yieldcrv 7 months ago
tl;dr not even close to closed source text-only modes, and a lightyear behind the other 3 senses these multimodal ones have had for a year

just a personal benchmark I follow, the UX on locally run stuff has diverged vastly
bionhoward 7 months ago
Sadly it’s as useless as OpenAI models because the terms of use read: “3.6 You will not use the Services for the following improper purposes: 4) Using the Services to develop other products and services that are in competition with the Services (unless such restrictions are illegal under relevant legal norms).”

For the billionth time, there are zero products and services which are NOT in competition with general intelligence. Therefore, this kind of clause simply begs for malicious compliance… go use something else.