One aspect that I feel is ignored by the comments here is the geopolitical forces at work. If the US takes the position that LLMs can't use copyrighted work or have to compensate all copyright holders, other countries (e.g. China) will <i>not</i> follow suit. This will mean that US LLM companies will either fall behind or be too expensive, which means China and other countries will probably surge ahead in AI, at least in terms of how useful the AI is.<p>That is not to say that we shouldn't do the right thing regardless, but I do think there is a feeling of "who is going to rule the world in the future?" that underlies governmental decision-making on how much to regulate AI.
Well, firing someone for this is super weird. It seems like an attempt to censor an interpretation of the law that:<p>1. Criticizes a highly useful technology
2. Matches a potentially-outdated, strict interpretation of copyright law<p>My opinion: I think using copyrighted data to train models for sure seems classically illegal. Despite that, humans can read a book, get inspiration, and write a new book without being litigated against. When I look at the litany of derivative fantasy novels, it's obvious they're not all fully independent works.<p>Since AI <i>is</i> and will continue to be so useful and transformative, I think we just need to acknowledge that our laws did not accommodate this use case and then change them.
The released draft report seems merely to be a litany of copyright holder complaints repeated verbatim, with little depth of reasoning to support the conclusions it makes.
I have yet to see someone explain in detail how transformer model training works (showing they understand the technical nitty-gritty and the overall architecture of transformers) and also lay out a case for why it is clearly a violation of copyright.<p>You can find lots of people talking about training, and you can find lots (way more) of people talking about AI training being a violation of copyright, but you can't find anyone talking about both.<p>Edit: Let me just clarify that I am talking about training, not inference (output).
Intellectual property law is quickly becoming an institution of hegemonic corporate litigation of the spreading of ideas.<p>If it's illegal to know the entire contents of a book it is arbitrary to what degree you are able to codify that knowing itself into symbols.<p>If judges are permitted to rule here it is not about reproduction of commercial goods but about control of humanity's collective understanding.
See "Copyright and Artificial Intelligence Part 3: Generative AI Training" (PDF):<p>* <a href="https://www.copyright.gov/ai/Copyright-and-Artificial-Intelligence-Part-3-Generative-AI-Training-Report-Pre-Publication-Version.pdf" rel="nofollow">https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...</a>
"But making commercial use of vast troves of copyrighted works to produce expressive content that competes with them in existing markets, especially where this is accomplished through illegal access, goes beyond established fair use boundaries."<p>I honestly can't see how this directly addresses fair use; it's an odd, sweeping statement. It implies that inventing something that borrows a little from many different copyrighted items is somehow not fair use? If it were one-for-one, yes, but it's not; it's basically saying creativity is not fair use. If it's not saying this and refers to competition in the existing market, they're making a statement about the public good, not fair use. That is basically a matter for legislators and a question of what the purpose of copyright is.
If AI companies in the US are penalized for this, then the effect on copyright holders will only be slowed until foreign AI companies overtake them. In such cases the legal recourse will be much slower and significantly limited.
Oh boy, right again
<a href="https://news.ycombinator.com/item?id=43940763">https://news.ycombinator.com/item?id=43940763</a>
<i>Representative Joe Morelle (D-NY), wrote the termination was “…surely no coincidence he acted less than a day after she refused to rubber-stamp Elon Musk’s efforts to mine troves of copyrighted works to train AI models.”</i><p>Interesting, but everyone is mining copyrighted works to train AI models.
Earlier on the report pdf:<p><a href="https://news.ycombinator.com/item?id=43955025">https://news.ycombinator.com/item?id=43955025</a>
(this is a duplicate of <a href="https://news.ycombinator.com/item?id=43960518">https://news.ycombinator.com/item?id=43960518</a>)
Ned Ludd's heirs win at last - High Court rules the spinning jenny IS ILLEGAL! All machine-made cloth and machines must be destroyed. This is the end of the road for all mechanical ways to make cloth.
Get naked, boys 'n girls = this will be fun!
> The remarks about Musk may refer to the billionaire’s recent endorsement of Twitter founder Jack Dorsey’s desire to “Delete all IP law"...<p>Yes please.<p>Delete it for everyone, not just these ridiculous autocrats. It's only helping <i>them</i> in the first place!
Big Tech: We shouldn’t pay, each individual piece of content is worth basically nothing.<p>Also Big Tech: We added 300,000,000 users' worth of GTM because we trained on the 10 specific anime movies of Studio Ghibli and are selling their style.
If anyone was skeptical of the US government being deeply entrenched with these companies in letting this blatant violation of the spirit of the law [1] continue, this should hopefully secure the conclusion.<p>And for the future, here's one heuristic: if there is a profound violation of the law anywhere that (relatively speaking) is ignored or severely downplayed, it is likely that interested parties have arrived at an understanding. Or in other words, a conspiracy.<p>[1] There are tons of legal arguments on both sides, but for me it is enough to ask: if this is not illegal and is totally fair use (maybe even because, oh no look at what China's doing, etc.), why did they have to resort to & foster piracy in order to obtain this?
copyright is long overdue for a total rework<p>the internet demands it.<p>the people demand free mega upload for everybody, why? because we can (we seem to NOT want to, but that should be a politically solvable problem)
I think a new chapter is about to begin. It seems that in the future, many IPs will become democratized; in other words, they will become public assets.
The USCO report was flawed, biased, and hypocritical. A pre-publication of this sort is also extremely unusual.<p><a href="https://chatgptiseatingtheworld.com/2025/05/12/opinion-why-the-copyright-offices-pre-publication-report-is-flawed-both-procedurally-and-substantively/" rel="nofollow">https://chatgptiseatingtheworld.com/2025/05/12/opinion-why-t...</a>
These are two different issues that, while apparently related, need separate consideration. Re the copyright finding, does the US Copyright Office have standing to make such a determination? Presumably not, since various claims about AI and copyright are before the courts. Why did they write this finding?