OpenAI says it's "impossible" to create AI models without copyrighted material

69 points by freeqaz over 1 year ago

21 comments

Conasg over 1 year ago
The biggest problem as I see it is that they monetized it, on a massive scale. If this had been open source like they promised, it would seem more like fair use. But to take huge amounts of copyrighted material without consent and effectively sell access to that material? I struggle to see how that could ever be legal.
achrono over 1 year ago
So much for all the altruism, and with this kind of an argument they really are scraping the bottom of the barrel -- it isn't the first time a company is lying to a government committee, it isn't the first time a company has claimed to be moral perfection incarnate only to turn out to be a non-repentant defiler, so we should all be prepared to expect worse going forward from OpenAI.
zgs over 1 year ago
Bollocks. There is lots of out-of-copyright content available. It just requires a person in the loop to assess the status.

If it truly were impossible, then it sounds like they've just admitted that they should have licensed the content rather than using it without permission.
sensanaty over 1 year ago
Aw the poor little entity backed by the comically evil trillion dollar megacorporation :(

If they can't manage to strike deals with content creators despite having M$' coffers at their disposal, then they should cease to exist. Let's also nuke M$ from orbit and split them up into a trillion little pieces while we're at it
lotsoweiners over 1 year ago
I can’t open a store without something to sell so it sounds like neither of us have an actual business model.
protocolture over 1 year ago
It's fairly correct. There's no Open Source database of internet comments, websites, most pop art styles, troubleshooting detail, coding Q&A.

Which is why any sane country would extend fair use to all of these things no questions asked, short of 1:1 replication.

Yankistan however is not a sane country and I am expecting the outcome of these legal tests to be absolutely bonkers.

> But they make money

Yes, so does the guy who takes 20 public domain stories, publishes them as a collection on Google Books and sells it to me for 1 dollar. People act as if their models aren't hugely backed by human labor, labor that is hugely transformative and commands a decent reward.
JohnFen over 1 year ago
Tough.

OpenAI's wholesale abuse of the internet has made the internet no longer an acceptable place for me to distribute data and writing to the general public. I have no sympathy whatsoever for them.
_zephyr over 1 year ago
I do think OpenAI has a point in what they're saying: if we expect human-level competency of AI, it needs to be able to see and train on human-accessible content and ideally with a similar distribution.

For example, I make an open source Firefox web extension for filtering internet content with my own classifier. That literally would not be able to exist without being able to be trained on web content, much of which is copyrighted. Requiring that I somehow either a) use only attributed data or b) detect and not use copyrighted content when trying to build something representative of my source distribution (e.g. the web) sounds like a recipe for a poor outcome. Now maybe my addon isn't your cup of tea - but what if you found out that the next generation of uBlock Origin etc. could not be as effective because of legislation because it wanted to use an AI model? Legislating too heavily around this area will, I believe, have a tremendous chilling effect for small businesses and open source folks trying to innovate in AI.

I've also worked commercially in the creation of two closed source machine learning models, but the domains were restricted enough that web content was not a particularly helpful input. One did all right, and one did not. Seeing bets succeed and fail gives me appreciation for the long-term and uncertain bets that OpenAI has been making for ages finally coming to fruition. I think without businesses being willing to make those bets the GPU-hours would have been hard to pay for.

I've wondered if potentially a different way out of this is not restricting the use of copyrighted material in the training process itself, but rather to instead only consider the created final works. Of course there are thorny problems there, too, but I don't see that having the same chilling effect on research and probably a lesser effect on business as well. One thing I think is clear though: we've reached a tipping point in the US similar to 1998 when the DMCA was legislated where the technology is forcing us to think carefully about what copyright means.

So I have a question for those on HN who have meaningfully worked in the creation of not just AI-generated content, but in the creation of some AI model that others use freely or commercially: what seem like promising paths forward here? Or to those working in copyright law (like @williamcotton): how do you see the status quo and potential paths forward?
stubish over 1 year ago
Such a terrible tragedy if a company has to give the lion's share of its billions in profit to the people who made it possible.

Maybe this is how the news and publishing industries get repaired.
zeruch over 1 year ago
It isn't impossible, it's just lacking the casual convenience of webscale infringement.

Can't cut into those potential margins after all.
genman over 1 year ago
In principle they are not wrong, as everything produced is effectively copyrighted and only released 70 years after the author's death, by which point everything is completely outdated and irrelevant.
ShadowBanThis01 over 1 year ago
Every time I see "Open"AI's fraudulent name, it pisses me off. I'm sure it augurs a new era of non-open "Open[bullshit]" branding.
thrill over 1 year ago
What makes humanity is copyrighted. Sure, let's create powerful thinking machines that understand nothing of modern society.
kelseyfrog over 1 year ago
Let them fight. Copyright transfer as it exists regarding work for hire is a legal fiction.
2OEH8eoCRo0 over 1 year ago
Sucks that copyright lasts so long eh?
moogly over 1 year ago
So yet another one of those tech innovations with an absolutely broken business model only kept afloat by skirting the law. Got it.
Bjorkbat over 1 year ago
Oh, it’s very much possible to create AI models with copyrighted materials. The “impossible” part is turning said AI model into a cash cow without fairly compensating the original rights holder.
Raed667 over 1 year ago
Then don't
macinjosh over 1 year ago
Intellectual property is a farce in the age of information.
hulitu over 1 year ago
> OpenAI says it's "impossible" to create AI models without copyrighted material

Just send the BSA to visit them. Oh, wait, they _are_ the BSA. /s

How time flies.
tonetegeatinst over 1 year ago
I think the real crime is that they keep on getting the PR for being "open source" and for safe AI development.

They aren't open sourcing their models and they are actually monetizing them. As for AI safety... it's such a grey area. Uncensored models have a place in this community, and trying to suppress them or claim there is no use for them is just insane to me.

What one person views as a moral or ethical response another might view as wrong. This is all culture and also based on your country's views which help form individual perspectives.

I think a big issue with AI other than the hardware barriers is that we have no clue if the amazing developments in the field will keep up with the same pace or stagnate. We can't accurately tell when we have hit the theoretical limit, or when we have true AGI.