TechEcho

Ask HN: What are some of the major problems being faced because of web scraping?

8 points by nachivpn over 10 years ago
Disclaimer: I work for an anti-scraping service company. I am not trying to advertise it, but simply trying to understand the problems people are actually facing because of web scraping and how it is affecting them.

4 comments

jpetersonmn over 10 years ago
I'm sure there are instances where scraping websites causes legitimate issues; however, most of the complaining I've seen from website operators was about the perceived theft of their data (even though it was publicly available through the browser), not so much about any bandwidth or performance issue that the scraping causes.

I'm of the opinion that web scraping has an unwarranted bad reputation. As long as I'm respecting your robots.txt and not scraping behind logins, etc., then it's no different from how Google operates.
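
For illustration, "respecting your robots.txt" could look roughly like the sketch below, using Python's standard `urllib.robotparser`. The robots.txt content, the `example-bot` user agent, and the `example.com` URLs are all made up for the example; a real scraper would fetch the file from the target site instead of parsing an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(EXAMPLE_ROBOTS_TXT)

# Check each URL before requesting it; skip anything that is disallowed.
for url in ("https://example.com/public/page", "https://example.com/private/page"):
    verdict = "fetch" if is_allowed(parser, "example-bot", url) else "skip"
    print(verdict, url)
```

A polite scraper would also read `parser.crawl_delay("example-bot")` and wait at least that long between requests when the site specifies one.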
joshschreuder over 10 years ago
I think bandwidth costs and the possibility of accidentally DDoSing the site if the scraper gets out of control are probably big issues, along with the 'theft of data' mentioned.
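
One common way to keep a scraper from getting out of control is to enforce a minimum delay between requests. The sketch below is only an illustration of that idea: the `Throttle` class and its parameters are hypothetical names, not from any particular library, and the 2-second interval is an arbitrary choice.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests
        self._clock = clock               # injectable so it can be faked in tests
        self._sleep = sleep
        self._last = None                 # time of the previous request, if any

    def wait(self):
        """Sleep until min_interval has elapsed since the last call.

        Returns the number of seconds slept.
        """
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay


# Usage: call wait() before every request to the same site.
throttle = Throttle(min_interval=2.0)
throttle.wait()  # the first call returns immediately
```

This only caps the request rate from a single process; it does nothing about retries piling up or several scraper instances hitting the same site at once.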
mattwritescode over 10 years ago
Surely you should know the problems if you are working for an anti-scraping company... Anyway...

Most people who own small websites don't necessarily know their website is being scraped on a daily basis (talking sole traders, tiny businesses). If they are paying for AdWords or local advertising through parish or county community websites, then they may think they are getting more bang for their buck than they actually are. If they get 10 visitors a day and 8 of those are scrapers, what does this really mean for their advertising revenue? Obviously they should be basing their return on investment against revenue, but still, a website is seen as a big thing for most small businesses.
iqonik over 10 years ago
Google penalising a site for not having original content may be one. Of course, it also uses bandwidth and costs the site you're scraping resources/money for no benefit to them.