Why DeepSeek had to be open source

525 points by AnhTho_FR 3 months ago

46 comments

tim333 3 months ago

The article says it had to be open source because otherwise people wouldn't trust the Chinese, but ByteDance, Tencent, Baidu, and Alibaba also do LLMs and are not open source.

It's funny reading an article interviewing the CEO:

> Until now, among the seven major Chinese large-model startups, it’s the only one... that hasn’t fully considered commercialization, firmly choosing the open-source route without even raising capital.

> While these choices often leave it in obscurity, DeepSeek frequently gains organic user promotion within the community.

The obscurity thing hasn't lasted! (article, Nov 2024: https://www.chinatalk.media/p/deepseek-ceo-interview-with-chinas)

The CEO's actual argument for open source is quite interesting: basically that it helps attract the best people and the value is in the team. It's kind of what used to work for OpenAI before it became the ClosedAI division of Microsoft.

lacoolj 3 months ago

> A Chinese AI API would likely receive skepticism in the West

"Would likely.."? No, it definitely does, and should, for historically good reason. Anyone using this should be doing so with enough grains of salt to fill SLC.

https://www.euronews.com/next/2025/01/28/chinese-ai-deepseek-censors-sensitive-questions-on-china-when-compared-to-rivals-like-chat

https://www.theguardian.com/technology/2025/jan/28/we-tried-out-deepseek-it-works-well-until-we-asked-it-about-tiananmen-square-and-taiwan

feverzsj 3 months ago

Didn't they only "opensource" weights like others?

garspin 3 months ago

An alternative explanation...

DeepSeek is a side project for a hedge fund.

Shorting NVIDIA and releasing everything, including the source, would have a high probability of being hugely profitable, with almost zero downside if it went unnoticed.

tgtweak 3 months ago

Wasn't there an internal Google email or memo that stated as much as well? That open source was moving faster and more efficiently than the best private teams and that it was accelerating, basically calling this out about 18 months early?

[1] https://www.artisana.ai/articles/leaked-google-memo-claiming-we-have-no-moat-and-neither-does-openai-shakes

jsemrau 3 months ago

The future of LLMs is shared research and that's the part I really like. It's ok, in my opinion, if not everything is shared, but this is too important to be in one company.

zbshqoa 3 months ago

Shouldn't that be the premise of a company that has "open" in its name as well?

kidsil 3 months ago

Linux won in the long run; I don't see why robust LLM models won't do the same.

In the end it'll be the scale of the infrastructure itself that will make the difference.

mirawelner 3 months ago

I think at the end of the day the reason that they open sourced DeepSeek is because they are programmers. Programmers like to show people the cool stuff they did. I had a boss who was rich enough to retire but was working three jobs because programming is cool and fun and he wants to do cool and fun things and show people the stuff that he did.

Everybody is trying to come up with a money-related reason for why they open sourced it, but at the end of the day the people who made it are engineers and not businesspeople. DeepSeek is really freaking cool, and they wanted to show people the cool thing they did.

hhthrowaway1230 3 months ago

You can't reproduce this model from the source, because the source isn't given; the result is given. Hence, not open source.

BoorishBears 3 months ago

a) As soon as I saw the domain I knew this was an ad (Lago has nothing to add to this conversation).

b) DeepSeek is the most dangerous thing that's happened to Open Source models in recent memory, through no fault of their own.

The hysteria has outrun the reality and now there's going to be a similarly disproportionate backlash.

It's already happening: Anthropic's CEO simultaneously railing against what they achieved and using it to justify stronger export restrictions, this morning.

And our current government doesn't want to be going on stage talking about $50B mega projects only for laypeople to (mistakenly) believe it only takes a few million to do the same.

And the idea that a Chinese company is the one that did this is going to play into so many hands, so perfectly. You can see the censorship story start taking the narrative despite this not being the first or last Chinese hosted model to comply with Chinese law.

Soon the national security angle will break out, especially if someone jailbreaks or abliterates it and gets "harmful outputs" that other models would also happily produce.

Some will couch the (very temporary and irrational) dip the market faced as a Chinese company managing to harm our markets by providing an unfairly priced product or some nonsense.

Open source AI is not guaranteed. We might still see protectionist bans against releasing models over a certain size and other irrational nonsense, and this has played into the kind of hysteria that allows that to happen.

soheil 3 months ago

Why are people so willing to believe false proofs/headlines? Clickbait has existed for decades, yet I still believe people are as gullible as the first day. Articles like this and from sites like phys.org are great examples of the case in point: they regularly get hundreds of upvotes based on completely ridiculous and false promises.

It's always been fascinating to me how often rhetoric wins over substance on HN.

kodzoman 3 months ago

Lago doesn't seem to be really open source, since it doesn't even support basic features like credit notes in the free version.

serverlessmania 3 months ago

It's not open source, we have no idea about the data used to train the model, and the paper doesn't explain it all.

apples_oranges 3 months ago

“We Have No Moat, And Neither Does OpenAI” - Google

bufferoverflow 3 months ago

And who will pay for all the expensive AI hardware? We're getting into the crazy phase of hundred-billion-dollar data centers.

Just because R1 was trained cheaply doesn't mean that this architecture cannot be trained on a very expensive data center to get much better and bigger models.

tempeler 3 months ago

Some benefits of US sanctions. The only way for China to spread in the West or non-West is to be open source. US monopolies may prevent competition in the US, but I don't think other countries would be willing to join in. They are actually doing a favor for the rest of the world. Will open source become cheap hardware thanks to the hardware wars? Are they trying to make everyone dependent on China by being paranoid like this? What are they doing? The US is going in the wrong direction. The inefficient ones should say goodbye to the market and continue with the efficient ones. They seem to be going after the monopolies and bringing about their own end.

nokun7 3 months ago

While open-source LLMs offer transparency and community-driven innovation, the future might not be exclusively OSS. Proprietary models have significant advantages, including the ability to secure investment for cutting-edge development, customize for specific business needs, and maintain competitive edges through secrecy. Moreover, companies can directly monetize proprietary models, providing a clear path to profitability, and they can offer enhanced security and privacy controls crucial for sensitive applications. Thus, both open-source and proprietary LLMs are likely to continue playing vital roles in AI's future landscape.

9cb14c1ec0 3 months ago

I am running DeepSeek R1 on my AMD Ryzen 7 PRO 5850U integrated GPU. While my experience with R1 doesn't make me think well of it, it is impressive how fast it is on such a weak graphics processor.

herval 3 months ago

DeepSeek's gambit proves that as much as Stable Diffusion proved that the future of Diffusion Models is open-source. In other words, it doesn't prove anything

basileafe 3 months ago

Remember when OpenAI's CTO squirmed in response to a question about using data from YouTube? https://digg.com/digg-vids/link/open-ai-ceo

OpenAI's CTO, Mira Murati, found herself in a tight spot when questioned about using YouTube data to train Sora. Her uncertain response has sparked controversy and raised concerns about their ethics in collecting and training data. This incident has fueled a growing debate about AI companies' data practices.

Then YouTube's CEO, Neal Mohan, said that if OpenAI used YouTube content without permission, it would violate their terms of service. Shall Neal freak out like how they are now!! Clearly they are scared; they know people are canceling their subscriptions with them to use free and better technologies. I know of hundreds of people who canceled their GPT subscription. Many developers are replacing the expensive GPT models with free DeepSeek.

Here is the current AI story:

Imagine two AI trains chugging along the tracks of innovation. The first, driven by OpenAI, was the early leader, after using Google's transformers (without which they wouldn't exist). They charged a hefty fare for anyone to hop aboard. We don't know how they trained their data. And big companies felt they had to buy tickets or risk being left behind. OpenAI thought they were the only engine in town. But then, another train pulled up alongside them. This new locomotive, powered by smart folks at DeepSeek, matched OpenAI's speed and fancy gadgets, if not better. The kicker? Everyone could ride for free!

Now, OpenAI's train is losing steam. People are jumping ship, with hundreds canceling their pricey GPT subscriptions. Meanwhile, the free train is picking up speed, aiming to make AI available to all.

In this tale of two trains, OpenAI might need to change their name to "ClosedAI" if they keep putting up barriers, being closed. The free and open train? That's the one chugging towards a brighter, better, free AI future for everyone.

deepseek = Open AI

rvz 3 months ago

As too easily predicted. [0][1]

Frontier AI model SaaS companies like OpenAI can never win the race to zero against $0 free or open source AI models, as they are already at the finish line.

[0] https://news.ycombinator.com/item?id=35177606

[1] https://news.ycombinator.com/item?id=35661548

whatever1 3 months ago

Open sourcing Llama just ensured that OpenAI will not create a dominant ecosystem that will attract most of the organic web traffic.

Meta's bet paid off, but at what cost.

aprilfoo 3 months ago

The current AI mega-buzz, fueled by fascinating technologies, finance and even geopolitics, makes it difficult to have a serious analysis beyond opinions and reactions. But the shock waves of that announcement by a small tech company are quite interesting.

> In fact, making it easier and cheaper to build LLMs would erode their [OpenAI, Meta, Google etc] advantages!

The narrative until now was: AI requires enormous and cutting-edge resources (money, energy), so it is only for the big boys and people who can talk multi-billion investments, so open source was not an option.

Some signs already appeared recently (plateau, bubble?), and DeepSeek seems to show that this model is questionable.

hsuduebc2 3 months ago

Nothing was proven. It's just an empty statement. I would guess that future LLMs would be largely based on those which are open sourced today, but the products which would be most useful would be kept proprietary. For the end user, the main argument is convenience and ease of use.

Exactly how it happened in operating systems.

jaharios 3 months ago

As I see it, what most big players are hunting is new data. Gaining trust is important; being the new big thing is also good. Being seen as "open" (not open source) makes others think you have nothing to hide and good intentions.

bityard 3 months ago

> but trained on inferior hardware for a fraction of the price

Do we know that this is actually true?

CooCooCaCha 3 months ago

Yes and no. Intelligence scaling with compute makes sense, so I doubt the advantage of closed models on large compute clusters will ever truly go away.

But that doesn’t mean smaller models aren’t useful.

mmaunder 3 months ago

The argument re OpenAI continuing to lead falls flat when you consider the talent they've lost. It's a different company compared to the one that built and launched GPT-4.

SathyaQuikFlip 3 months ago

DeepSeek shows proof that all models can be equally as good as each other, and that the best models will eventually be open-source. I believe it's a good thing for our world.

varsketiz 3 months ago

Sorry for possibly a stupid question, but what is the license for commercial use? If I want to run R1 in my DC, build a product on top and charge people for it. Is it MIT?

liminal 3 months ago

I'd love to see the training data open sourced for all models so we can be sure no copyright material has been used. Just kidding, we all know it's stolen.

ge96 3 months ago

Tin foil hat: has anyone run it and used Wireshark to see whether it makes external requests (unless it had to, like a browser agent)?

alecco 3 months ago

This is quite bad blogspam appealing to the open source crowd. They didn't even bother to read a bit.

DeepSeek is open source because the founders are part of the new generation of Chinese graduates who relate more to the global youth than to Boomer Chinese CEOs who are completely out of touch. And right on time, because the CCP is fed up with them, too.

Last week DeepSeek founder Liang Wenfeng was speaking practically face to face with Chinese Premier Li Qiang at a symposium: https://www.youtube.com/watch?v=zMyc3vhpLyI. And they seem to be quite aligned.

Why didn't this blogspam of an article pick up on any of that?

https://news.ycombinator.com/item?id=42852266

swyx 3 months ago

meta question: hey Anh! how come you stopped blogging on your github? i thought that was working for you.

titzer 3 months ago

I, for one, abhor the idea of megacorps running models and AI as a service as they do now. If nothing else, the internet proved to us that an absolute gold mine of technological value can and will be enshittified to the point of unusability when it is cornered by Big Tech. I shudder to think of models trained specifically to convince people to buy things--and I am looking directly at Big Tech's advertising model as one of the worst possible incubators for this technology.

Don't forget to drink your Ovaltine.

1970-01-01 3 months ago

> Does that mean proprietary AI is done? No.

Perfectly stated.

The AI jump-to-conclusions mat is so worn down, it's become paper thin. The shock of DeepSeek's costs does not auto-magically force all LLMs to become open source. Silicon Valley tech has always favored whoever delivers inside the cheaper, better, faster triangle. Anyone with an MBA should know this includes open-source LLMs. As of today, DeepSeek is ahead. As soon as OpenAI answers with a new 'fastfood dollar menu' for ChatGPT, with 'even more special' secret-sauce ingredients, we're going to see them back to normal business.

shahzaibmushtaq 3 months ago

It's DeepSeek's low-price, low-investment reasoning models that have sent a shockwave around the world.

> Compare $60 per million output tokens for OpenAI o1 to $7 per million output tokens on Together AI for DeepSeek R1.

Open source isn't a primary rational aspect to prove anything. We can't even prove what's going to happen tomorrow; proving a statement that is linked to the future is utter nonsense.

chrchr 3 months ago

As long as entrenched Google, Meta and the Chinese Communist Party can use Open Source LLMs to kneecap upstart rivals, I agree. Once the upstarts are neutralized, the open LLMs will stop.

niyyou 3 months ago

Again. This. is. not. Open-source. At best, open-weights. Clickbait 100%.

fuddle 3 months ago

This looks like another LLM generated article.

gnarlouse 3 months ago

Yeah no, it makes way more sense that it's an attack by the Chinese government on the US economy.

prjkt 3 months ago

Source:
- Training SW [x]
- Inference SW [x]
- Evaluation SW [x]
- Data [x]

Output:
- Weights []

DeepSeek is closed-source with *open-weights*

danjl 3 months ago

Click bait headline. Nothing was proven about open source as the future. "To gain a foothold in Western markets, DeepSeek had to open-source its models." This is an opinion, not the only solution. DeepSeek could have remained proprietary just as easily. The bias of the author becomes clear at the end when he starts to promote his own open source company. Everyone has an agenda.

DrBenCarson 3 months ago

DEEPSEEK IS NOT OPEN SOURCE, THEY JUST PUBLISHED THE WEIGHTS

pointedAt 3 months ago

wait, this ain't another whitelabel OpenAI ChatGPT-oOPs cosplay?
