科技回声
1 in 3 news sites block OpenAI via robots.txt

2 points by palewire, over 1 year ago

2 comments

palewire, over 1 year ago
The 392 news organizations listed at this URL have instructed OpenAI's GPTBot to not scan their sites, according to a continual survey of 1,119 online publishers conducted by the homepages.news archive. That amounts to 35.0% of the total.

The artificial intelligence company has suggested it will not train future editions of ChatGPT using sites that opt out of GPTBot crawls via the robots.txt convention.

Our archiving system gathers each news organization's robots.txt file twice per day. This page automatically updates with the latest results.

The sites we track are a best effort to cover a broad cross-section of news publishing.

That said, the sample is not comprehensive. It's also primarily focused on the English language market.
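
For readers wondering what such a check might look like, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not the homepages.news implementation, which the post does not describe. The helper name `blocks_gptbot`, the sample robots.txt texts, and the `*.example` hostnames are all hypothetical, and the files are parsed from strings rather than fetched.

```python
from urllib.robotparser import RobotFileParser


def blocks_gptbot(robots_txt: str, url: str) -> bool:
    """Return True if this robots.txt text disallows GPTBot from fetching url.

    Hypothetical helper for illustration, not the survey's actual code.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("GPTBot", url)


# Hypothetical robots.txt contents; a real survey would download each file.
SAMPLE_BLOCKING = """\
User-agent: GPTBot
Disallow: /
"""

SAMPLE_OPEN = """\
User-agent: *
Disallow: /private/
"""

samples = {
    "blocking.example": SAMPLE_BLOCKING,
    "open.example": SAMPLE_OPEN,
}

# Collect the sites whose homepage is off limits to GPTBot.
blocked = [
    host for host, text in samples.items()
    if blocks_gptbot(text, f"https://{host}/")
]

# Share of sampled sites that block GPTBot, as a percentage.
share = round(100 * len(blocked) / len(samples), 1)

# The same arithmetic applied to the figures quoted in the post.
reported_share = round(100 * 392 / 1119, 1)
```

Only the homepage URL is tested here, so a site that disallows GPTBot on some paths but not `/` would count as not blocking. Whether the survey treats partial blocks the same way is not stated in the post.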
jlpcsl, over 1 year ago
This should be opt-in so this corporate piracy is blocked by default.