26% of the top websites are now blocking GPTBot

62 points by twapi, over 1 year ago

8 comments

entuno, over 1 year ago
It's worth noting that in this case "blocking" means "asking nicely for it not to index them", so how effective this is depends on how well behaved the bots are.
There is a danger, though, if certain types of sites are more likely to block GPTBot than others, because that would skew the data set it trains on, which could have longer-term impacts on all the content generated with it. If all the good-quality sites block it and the sites full of AI-generated junk don't, then that sounds like a downward spiral.
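
To make the "asking nicely" point concrete, here is a minimal sketch, assuming the blocking discussed here is done through robots.txt rules aimed at the `GPTBot` user agent. The robots.txt content, the `example.com` URL, and the `is_allowed` helper are hypothetical and only for illustration. The check runs on the crawler's side, so it only has an effect if the bot chooses to perform it and honor the result.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that opts out of GPTBot
# but leaves every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This is the check a well-behaved crawler runs on itself before
    requesting a page. Nothing on the server enforces the answer.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # placeholder URL
    print("GPTBot allowed:", is_allowed(ROBOTS_TXT, "GPTBot", page))
    print("SomeOtherBot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

A crawler that skips this check can still fetch the page, which is why the effectiveness depends entirely on how well behaved the bot is.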
galkk, over 1 year ago
This is not a problem. They will just buy the data in bulk from some third party that does the scraping for them.
I have heard of many instances of such things.
accrual, over 1 year ago
I don't own any revenue-generating websites, but I do feel like the content I create is useful. I'd rather have GPT slurp it up with the hope of some small piece of me being emitted now or in the future for others to benefit from.
I'm sure my perspective would be different if I were paying my employees to create unique content for our brand, though.
I'm not sure, though. At the end of the day, I think I'd rather information be free. But that's not a sustainable model in many industries.
marcinzm, over 1 year ago
Good news for Google's AI teams, since Google's AI scraping is harder to block unless websites are willing to stop showing up in Google search results.
skilled, over 1 year ago
I would love to sit down with 10 random website owners / managers from this list and ask them the following questions:
- Why did you block GPTBot?
- Are you aware that your content is scraped, directly copied and otherwise repurposed by other websites that don't block GPTBot?
- What are your plans if future iterations of the GPT model turn out to contain information that you wrote or produced? Are you going to fight it, and if so, how?
I think these are legitimate questions, and they are the ones I would love to hear answers to, because I would love nothing more than OpenAI being hamstrung based on the bullshit that they pulled last year with ChatGPT.
Never forget that OpenAI stole the web and has had $11.3B in funding[0] and is seeking another round to place it at an $80-90 billion valuation[1].
[0]: https://www.crunchbase.com/organization/openai/company_financials
[1]: https://techcrunch.com/2023/09/26/openai-is-reportedly-raising-funds-at-a-valuation-of-80-billion-to-90-billion/
johneth, over 1 year ago
As long as ChatGPT offers few or no financial incentives to creators of media, this percentage will increase.
As they are at the moment, OpenAI are parasites.
kenmacd, over 1 year ago
I really don't understand why sites would do this. To each their own, but it currently lowers my opinion of the site. I was disappointed to see NPR and Ars on the list.
tomaszs, over 1 year ago
OpenAI is trying to set a precedent of default approval for crawling and training AIs on copyrighted content. Compared to search crawling, it doesn't prove to offer anything in return.
More: https://tomaszs2.medium.com/ai-may-pirate-music-and-movies-1e931402bd20