1 in 3 news sites block OpenAI via robots.txt

2 points by palewire over 1 year ago

2 comments

palewire over 1 year ago
The 392 news organizations listed at this URL have instructed OpenAI’s GPTBot to not scan their sites, according to a continual survey of 1,119 online publishers conducted by the homepages.news archive. That amounts to 35.0% of the total.

The artificial intelligence company has suggested it will not train future editions of ChatGPT using sites that opt out of GPTBot crawls via the robots.txt convention.

Our archiving system gathers each news organization’s robots.txt file twice per day. This page automatically updates with the latest results.

The sites we track are a best effort to cover a broad cross-section of news publishing.

That said, the sample is not comprehensive. It's also primarily focused on the English language market.
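
For anyone wondering what a check like this might look like, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It is not the homepages.news code; the function names `blocks_gptbot` and `blocked_share` and the `example.com` URL are hypothetical, and it assumes the robots.txt text has already been downloaded.

```python
from urllib.robotparser import RobotFileParser


def blocks_gptbot(robots_txt: str, url: str = "https://example.com/") -> bool:
    """Return True if the given robots.txt text disallows GPTBot from `url`.

    Hypothetical helper: `url` defaults to a placeholder homepage.
    """
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("GPTBot", url)


def blocked_share(blocked: int, total: int) -> float:
    """Percentage of surveyed sites that block the crawler, to one decimal."""
    return round(100 * blocked / total, 1)


if __name__ == "__main__":
    sample = "User-agent: GPTBot\nDisallow: /\n"
    print(blocks_gptbot(sample))
    # Figures quoted in the post: 392 blocking sites out of 1,119 surveyed.
    print(blocked_share(392, 1119))
```

This only tests whether the site root is off limits to GPTBot. A site that blocks just a few paths would count as not blocking under this sketch, and the survey may use a different rule.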
jlpcsl over 1 year ago
This should be opt-in so this corporate piracy is blocked by default.