Copyright-ignoring AI scraper bots laugh at robots.txt

6 points by LinuxBender, about 1 month ago

2 comments

maniacwhat, about 1 month ago
The AI companies have shown they don't care at all about the preferences of site owners by ignoring them.

I don't see why a new language to express preferences would make any difference here.
PeterStuer, about 1 month ago
Honestly, some sites are so ridiculously misconfigured in their anti-bot zeal that it becomes a Heisenberg-like dilemma.

E.g. I want to pull in the RSS. It is there specifically for m2m. If I dare get the robots.txt, I'm flagged as a bot and denied the whole site, *including* not just the RSS but even the parts that are not denied per the robots.txt.
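
For reference, the polite-client flow described above (read robots.txt first, then fetch only what it allows) looks roughly like this. A minimal sketch using Python's standard `urllib.robotparser`; the robots.txt content, the URLs, and the user-agent string are all made up for illustration, and nothing here touches the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def check_urls(robots_txt, user_agent, urls):
    """Return {url: allowed} according to the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no HTTP request is made here.
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}


result = check_urls(
    ROBOTS_TXT,
    "example-feed-reader",  # hypothetical user-agent name
    [
        "https://example.com/feed.rss",
        "https://example.com/private/page",
    ],
)

for url, allowed in result.items():
    print(url, "allowed" if allowed else "disallowed")
```

In a real client the robots.txt text would come from an actual request to the site, which is the very request that gets the client flagged in the scenario above.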