TechEcho

The 392 news organizations listed at this URL have instructed OpenAI’s GPTBot to not scan their sites, according to a continual survey of 1,119 online publishers conducted by the homepages.news archive. That amounts to 35.0% of the total.

The artificial intelligence company has suggested it will not train future editions of ChatGPT using sites that opt out of GPTBot crawls via the robots.txt convention.

Our archiving system gathers each news organization’s robots.txt file twice per day. This page automatically updates with the latest results.

The sites we track are a best effort to cover a broad cross-section of news publishing.

That said, the sample is not comprehensive. It's also primarily focused on the English language market.
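For readers curious what "opting out via robots.txt" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not the archive's actual code: the function name `blocks_gptbot`, the sample robots.txt text, and the `example.com` URL are all assumptions made for illustration. It only checks whether a given robots.txt body disallows the `GPTBot` user agent from the site root.

```python
from urllib.robotparser import RobotFileParser


def blocks_gptbot(robots_txt: str, url: str = "https://example.com/") -> bool:
    """Return True if the given robots.txt text disallows GPTBot from `url`.

    Hypothetical helper for illustration; the URL default is a placeholder.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("GPTBot", url)


# Illustrative robots.txt: GPTBot is shut out, every other crawler is allowed.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocks_gptbot(sample))
```

A survey like the one described would presumably fetch each publisher's live robots.txt and run a check of this kind on it, but how the archive actually evaluates the files is not stated in the post.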

This should be opt-in so this corporate piracy is blocked by default.

1 in 3 news sites block OpenAI via robots.txt

2 comments
