Gary Marcus is all excited about this, but rightly or wrongly, OpenAI have always claimed[1] that training is fair use. In other words, they believe that if something is "publicly available", it is also "publicly available for training". There's nothing special (as far as I can see) about what they did or didn't do for Sora in this respect.<p>Mira Murati knows that this is a controversial viewpoint, which is why she's dodging the question. To stretch that out into a whole Substack post is just the sort of thing Gary Marcus does.<p>Everyone knows that OpenAI (and other big AI training companies) train on scraped data and copyrighted works. They claim fair use. Whether what they do or have done is fair use is the pivotal question in a whole raft of lawsuits. There's nothing new here as far as I can tell.<p>[1] e.g. in the NYT case <a href="https://copyrightblog.kluweriplaw.com/2024/02/29/is-generative-ai-fair-use-of-copyright-works-nyt-v-openai/" rel="nofollow">https://copyrightblog.kluweriplaw.com/2024/02/29/is-generati...</a>
Gary Marcus has complained his way into becoming such an authority on AI that he's been in front of Congress. He's never done anything and regularly contradicts himself (claiming that AI models are both useless and so dangerous they should be banned).<p>He's the opposite of the type of person we should be supporting in the tech community.
I struggle with Gary's posts due to how effortlessly he shifts from criticizing LLMs as garbage to simultaneously asserting they pose an existential threat to humanity. An intellectually honest analysis should recognize the contradiction inherent in these positions.
Theft, or borrowing, to get started is a tried-and-true modern business model, isn't it? First you "borrow", then you become quite useful, then everyone forgets about the first bit.<p>It appears to have made some people billionaires. Examples:<p>Spotify was started by an employee uploading their MP3 collection.<p>Facebook scraped the Harvard student directory.
This thread is rapidly filling up with grey comments that also have multiple written replies endorsing them (which is a bit of a trend on this topic). Can we stop doing this, please?<p>The semantics of the downvote button are pretty clear on this point.