Yes, well, in a way they're right, and I suspect everyone here knows it no matter how high and mighty they might want to act when commenting. When foreign (here 'Chinese') competition just ignores copyright laws while 'western' companies have to abide by them for every piece of data they use to train their models, the former will have a clear advantage over the latter. This also happens to be how the USA acted in the 1800s [1]:<p><i>the United States declined an invitation to a pivotal conference in Berne in 1883, and did not sign the 1886 agreement of the Berne Convention which accorded national treatment to copyright holders. Moreover, until 1891 American statutes explicitly denied copyrights to citizens of other countries and the United States was notorious in the international sphere as a significant contributor to the "piracy" of foreign literary products. It has been claimed that American companies for the most part "indiscriminately reprinted books by foreign authors without even the pretence of acknowledgement" (Feather, 1994, 154). The tendency to freely reprint foreign works was encouraged by the existence of tariffs on imported books that ranged as high as 25 percent (see Dozer, 1949).</i><p>[1] <a href="http://socialsciences.scielo.org/scielo.php?script=sci_arttext&pid=S0124-59962008000100002" rel="nofollow">http://socialsciences.scielo.org/scielo.php?script=sci_artte...</a>
What I don't understand is why this is always presented as a "race" that "we" have to win or else. It's just such a strange framing to me and every time I see it, it's presented as some sort of self-evident truth, but I don't think it's self-evident at all.
It seems that most people on this site believe that this is a good thing, but all this restriction would mean is that, for the next while, the only companies able to afford mass licensing would be in the S&P 500, and that's assuming these companies wouldn't just flock to a nation outside of America's influence.<p>At some point, it becomes a national security issue. This technology is going to be leveraged in ways we can't even dream up today. Copyright law needs to be re-imagined in a way that won't restrict advancement in AI and AI-adjacent technology. It's not because we want to; it's because we have to.
It sounds like government continuing to honor the property rights of everyone is getting in the way of a handful of rich people's desire to take all that value for themselves.
So basically, we know China is never going to pay the publishers/content creators (<i>never</i>). If we hold our principles to OpenAI (<i>pay who you took from</i>), they will go bankrupt. So of course they are speaking in end-game language. To suggest the race is lost even before it starts is an incredible thing.<p>How is it that we can theorize that the model would get better with more data, but we can't theorize that the business model would need to get bigger (pay the content creators) to train the model? Shoot first and ask questions later (or rather, BEG later).
So, does that mean that OpenAI's models will be open source then? I mean, if it's built on our collective intellectual property, it's only fair we have free access to it.
I think we just need to rethink copyright for language models. I'm okay with just licensing 1 copy of a work to any LLM throughout its various generations. Just don't pirate it: if no special license is available, buying the ebook should suffice. It should be no different from a human buying a copy. The only rule should be that it does not leak the entire work.
It's always interesting to see how the title of a HN post radically changes the people who comment and vote. The AI-friendly people are being carpet bombed by haters, but in a model release thread the haters would be flagged to oblivion.
something tells me that this pathetic messaging approach is not going to be the one that squares the circle between "piracy is illegal" and "information wants to be free"
Sorry, but it is actually a huge problem for the US if the DeepSeek models are able to train on sorta-illegal dumps of scientific papers (the ones that are paywalled by scientific journals) and US models aren't.<p>Everyone WILL start using hosted frontier Chinese models if they are demonstrably better at answering scientific questions than ChatGPT, sending essentially all US research questions into a Chinese data dump. This is even worse than the national security catastrophe that is TikTok (even aside from the EVEN BIGGER issue that China will have models that are staggeringly better than those in the US, because they are up to date on the science).<p>I understand the reflex against AI companies "stealing content", but we need to stay competitive and figure out the financial compensation later. This is not a case where our unbelievably generous copyright laws should take precedence over US competitiveness.
You have to remember a company is not a social being with balanced obligations. Its obligation is to its owners and not to society.<p>If OpenAI’s leadership weren’t saying precisely this, they wouldn’t be doing their jobs.
Copyright infringement is not stealing[0]. The person still has what they made. Not sure why they propagate it as theft. Seems like a pro-copyright, extremist propaganda article, which goes significantly against the progress of the arts and sciences.<p>[0]<a href="https://en.m.wikipedia.org/wiki/Dowling_v._United_States_(1985)" rel="nofollow">https://en.m.wikipedia.org/wiki/Dowling_v._United_States_(19...</a>