I've resigned from my role leading the Audio team at Stability AI

66 points by georgehill, over 1 year ago

11 comments

abracadaniel, over 1 year ago
Humans are allowed to train themselves on copyrighted works without custom contracts, even though we may then go on to generate profitable work from our experience. An aspiring musician will be heavily influenced by what they’ve already heard, even borrowing melodies or styles from existing songs.

Maybe current AI output is still too similar to the training data, but that seems like more of a reason to regulate the output and not the input. We already have legal frameworks to prevent people from replicating the work of others. I don’t understand why there needs to be a distinction between using copyrighted works to train computers and using them to train people.
Ukv, over 1 year ago
> one of the factors affecting whether the act of copying is fair use, according to Congress, is “the effect of the use upon the potential market for or value of the copyrighted work”. Today’s generative AI models can clearly be used to create works that compete with the copyrighted works they are trained on. So I don’t see how using copyrighted works to train generative AI models of this nature can be considered fair use.

Fair use's factors each weigh for or against a finding of fair use, as opposed to needing to strictly satisfy all four. In particular, what machine learning is likely heavily resting on is that "The more transformative the new work, the less will be the significance of other factors" (Campbell v. Acuff-Rose Music).

For instance, Google Translate was trained on translators' works, and may in part compete with the market for translations, but I'd claim it is transformative by nature of adding something new (instant on-demand translation of novel text) and not merely superseding the static works it was trained on.

How to decide "the effect of the use upon the potential market for or value of the copyrighted work" is also a bit of a gray area and one of the questions the US Copyright Office were seeking comments on. Should it be about the impact models have on the market for that general class of works? Or the extent to which training on a specific work impacted the market for that specific work, compared to if the model was not trained on that work?
aeternum, over 1 year ago
Completely agree, this is why I am banning my son from all reading and listening to music until he becomes an adult.

He may want to be an author or musician someday and it is clearly in society's best interest to ensure his mind remains uncorrupted by words and ideas belonging to others.
kromem, over 1 year ago
What a noble move.

How much better might the world be had Ford done the same and not endangered the employment of individual manual auto workers with his assembly line, or had Gutenberg been more mindful of those poor calligraphers who worked tirelessly for centuries in preserving the books he would otherwise have never been able to print.

The idea that progress is perhaps more important to society's overall health than the preservation of the status quo is such a horrid idea and I mourn the many chiselers who lost their jobs to bronze smithing.

Also, maybe with Neuralink we can finally identify exactly what copyrighted works inform human creativity such that we can properly charge licensing and residuals for such usage.

At least then we might see advancing technology used for something *good*.
hresvelgr, over 1 year ago
I've changed my tune on this matter a few times and where I've arrived right now is this: AI generated works should not be copyrightable by law and all productions should be public domain. If the entirety of publicly available intellectual property is going to be treated like public domain, the produced works should naturally become so as well. I think the "theft" of intellectual property for these models is inevitable and this is probably the best outcome for everyone.
theboonies, over 1 year ago
So I have given this some thought, and the legal issue is not fair use unless the new work is really clearly derivative. The concern is that the system was trained on content that was not purchased or licensed. The systems effectively need a license to CONSUME / LEARN FROM the content. The human equivalent would be purchasing a single copy of a work, or checking it out of a library. In one case the consumer purchased a license for a copy of the work, in the other the library did.

In the case of the generative AI companies, I think there is a good amount of content that has been consumed that was neither in the public domain nor properly licensed.

So what is the legal remedy?
gumby, over 1 year ago
Since I learn from things I read, I don't see a problem with a potential mental prosthesis doing the same, any more than I think a mental prosthesis like a search engine shouldn't be allowed to search copyrighted terms.

I can see the argument to be made, but I personally think such arguments either come from a zero-sum mentality (a lack of abundance mentality in this case) or, likely not in this author's case, are being called upon as a way to address a different issue (fear, typically).
martindbp, over 1 year ago
I think at this point we have to ask ourselves if we want these models or not, and who will have them. If you require licensing copyrighted data then it will be so unprofitable to train them that no company will be able to at this point. In the future, then, only OpenAI or the other giants will be able to afford it (even more so, since you need the expensive GPU clusters as well).

It's also not technically possible at this point to attribute generated content back to the source training data, so either all works will have to be equally compensated (including my meme posted on Reddit) or each piece will need to be individually negotiated with the rights holder. I don't see how these models can exist under these conditions.
machdiamonds, over 1 year ago
Just like for humans, in terms of copyright, only the output should matter, not the training data.
ricardobeat, over 1 year ago
> Today’s generative AI models can clearly be used to create works that compete with the copyrighted works they are trained on

Humans can also create works that compete with the copyrighted works they've been trained on. Should music schools and conservatoriums start banning all commercial works during lessons? Should students forfeit a % of their lifetime earnings, distributed to the copyright holders of everything they have listened to in their lives so far? Granted, nobody will be happy with computers taking over chunks of 'creative' work, but this hard stance will not hold water.

I applaud the courage to stand by personal principles over the soft embrace of a nice job though. Not many people can do it, and we need more of it.
Havoc, over 1 year ago
Reminds me very much of big tech’s last round: granting itself ownership of behavioural data à la surveillance capitalism.

By the time everyone realised it was profiting at society’s cost it was too late. Attempts to put the genie back into the bottle via GDPR etc. are futile.

Same here. The ship has functionally sailed already. All the LLMs are trained on this.