TechEcho

11 comments

lesuorac 4 months ago

I still find it very (depressingly) hilarious how everybody sees this as a lawsuit about whether training on copyrighted content is legal or not.

Literally, the NYT claimed that OpenAI maintained a database of NYT's works and would just surface the content verbatim. This is not an AI issue, it's settled copyright law.

rustc 4 months ago

I hope they don't settle early and we finally get an answer to whether training AI on copyrighted content is fair use or not.

n0rdy 4 months ago

I like following the OpenAI vs. NYT case, as it's a great example of a controversial situation:

- OpenAI created their models by parsing the internet while disregarding copyrights, licenses, etc., or by looking for legal loopholes
- by doing that, OpenAI (alongside others) developed a new progressive tool that is shaping the world and seems to be the next “internet”-like thing (impact-wise)
- NYT is not happy about that, as their content is their main asset
- less democratic countries can apply even less ethical practices for data mining, as copyright laws don't work there, so one might claim that it's a question of national defense, considering that AI is actively used in miltech these days
- while the ethical part is less controversial (imho, as I'm with NYT there), the legal one is more complicated: the laws might simply say nothing about this use case (think GPL vs. AGPL license), so the world might need new ones.

And so on...

screye 4 months ago

I can't imagine a scenario where pre-training on someone else's works is fair-use, but distilling from a proprietary LLM isn't.

pkamb 4 months ago

Is anyone building a public domain repository / AI training ground for old newspapers? Anything before 1930 has no restrictions. Newspapers.com has pretty good content, but the interface and search are extremely lacking. Google News was abandoned a decade ago. This seems like something where AI could really help, for once. Not in training chatbots or whatever, but actually just providing great search for articles in books, newspapers, and magazines.

ViktorRay 4 months ago

Would anyone here be able to explain to me where this money is going? Are the lawyers working for the New York Times really this expensive? If so these lawyers must be getting massive amounts of money...

nimish 4 months ago

NYT will lose:

Copyright only protects the actual text. LLMs have weights, not exact copies. In any case, saying "if I put in some input and get copyrighted output" is tantamount to copyright violations; if I use a generative tool and generate copyrighted info, is it the tool's fault?

An LLM is a dump of effectively arbitrary numbers that, when hooked up to a command line, uses one of the world's most awful programming languages to evaluate and execute.

OpenAI at most broke an EULA or some technicality on copyright w.r.t. local ephemeral copies. What's the damage to the NYT though?

gotoeleven 4 months ago

Are they paying the lawyers with government money? I'm seriously asking. Why is the government paying 10s of millions of dollars/year to the New York Times? How can they still claim to be a news organization without having disclosed this? If the government is paying the NYT, then don't their productions belong in the public domain?

https://x.com/stillgray/status/1887191056074350690

SebFender 4 months ago

"OpenAI asserts that training AI models using publicly accessible content, including material from The New York Times, is protected under longstanding fair use principles."

Incredible.

The foundation of fair use is a transformative and non-consumptive use of copyrighted material.

tester756 4 months ago

Why is it THAT expensive?

user3939382 4 months ago

My ideal solution would be to release everything NYT has written in the past into the public domain, turn it over to archive.org, and dismantle NYT so it’s no longer an issue in the future.

The New York Times Has Spent $10.8M in Its Legal Battle with OpenAI So Far
