TechEcho
26% of the top websites are now blocking GPTBot

62 points by twapi over 1 year ago

8 comments

entuno over 1 year ago
It's worth noting that in this case "blocking" means "asking nicely for it not to index them" - so how effective this is depends on how well behaved the bots are.

There is a danger though if certain types of sites are more likely to block GPTBot than others, because that would end up skewing the data set that it trains off, which could have longer term impacts on all the content generated with it. If all the good quality sites block it and the sites full of AI generated junk don't, then that sounds like a downward spiral.
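
To illustrate the "asking nicely" point: this kind of block is typically expressed in a site's robots.txt file, which a crawler is free to honor or ignore. Below is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `is_allowed` helper, and the example.com URL are all hypothetical and not taken from the thread; the sketch only shows what a well-behaved bot would conclude, not how any particular crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration: disallow GPTBot, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"  # placeholder URL
    print(is_allowed(ROBOTS_TXT, "GPTBot", url))        # False
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", url))  # True
```

Nothing in this check is enforced by the server: a crawler that never consults robots.txt would fetch the page regardless, which is why the effectiveness depends on the bot's behavior.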
galkk over 1 year ago
This is not a problem. They will just buy the data in bulk from some third party that would do the scraping for them.

I've heard of many instances of such things.
accrual over 1 year ago
I don't own any revenue generating websites, but I do feel like the content I create is useful. I'd rather have GPT slurp it up with the hope of some small piece of me being emitted now or in the future for others to benefit from.

I'm sure my perspective would be different if I was paying my employees to create unique content for our brand, though.

I'm not sure though. At the end of the day, I think I'd rather information be free. But that's not a sustainable model in many industries.
marcinzm over 1 year ago
Good news for Google's AI teams since Google's AI scraping is harder to block unless websites want to not show up in Google search results.
skilled over 1 year ago
I would love to sit down with any 10 random website owners / managers from this list and ask them the following questions:

- Why did you block GPTBot?
- Are you aware that your content is scraped, directly copied and otherwise repurposed by other websites that don't block GPTBot?
- What are your plans if, in future iterations of the GPT model, you see that the model has information that you wrote or produced? Are you going to fight it, and if so, how are you going to do that?

I think these are legitimate questions and they are the ones that I would love to hear answers to, because I would love nothing more than OpenAI being hamstrung based on the bullshit that they pulled last year with ChatGPT.

Never forget that OpenAI stole the web and has had $11.3B in funding[0] and is seeking another round to place it at an $80-90 billion valuation[1].

[0]: https://www.crunchbase.com/organization/openai/company_financials
[1]: https://techcrunch.com/2023/09/26/openai-is-reportedly-raising-funds-at-a-valuation-of-80-billion-to-90-billion/
johneth over 1 year ago
As long as ChatGPT offers few or no financial incentives to creators of media, this percentage will increase.

As they are at the moment, OpenAI are parasites.
kenmacd over 1 year ago
I really don't understand why sites would do this. To each their own, but it currently lowers my opinion of the site. I was disappointed to see NPR and Ars on the list.
tomaszs over 1 year ago
OpenAI is trying to set a precedent of default approval for crawling and training AIs on copyrighted content. Compared to search crawling, it doesn't appear to offer anything in return.

More: https://tomaszs2.medium.com/ai-may-pirate-music-and-movies-1e931402bd20