Ask HN: What’s the legality of web scraping? Part 2

8 points by backend-dev-33, almost 6 years ago
Part 1: https://news.ycombinator.com/item?id=20256681 (closed to new comments)

What (legal?) trick allows search engines to crawl (well, we know that "crawl" is a synonym of "scrape") and index content protected by terms of use? Is it "fair use" or something else?

One example: Craigslist!

In their terms of service:

> USE. Unless licensed by us in a written agreement, you agree not to use or provide software (except general purpose web browsers and email clients) or services that interact or interoperate with CL, e.g. for downloading, uploading, creating/accessing/using an account, posting, flagging, emailing, searching, or mobile use. You agree not to copy/collect CL content via robots, spiders, scripts, scrapers, crawlers, or any automated or manual equivalent (e.g., by hand).

On the other hand: https://www.google.com/search?q=site%3Asfbay.craigslist.org+couch&oq=site%3Asfbay.craigslist.org+couch

Google is able to index CL: you can query the Google index, restrict it to a single CL city, and see the ads. And we know Google makes money from it (advertising, for example).

I cannot imagine Google obtaining a "written agreement" from CL ))

2 comments

gitgud, almost 6 years ago

Crawling is not entirely synonymous with scraping. Crawling web pages is usually done with the specific purpose of search and categorisation, whereas *scraping* is much more generalized... Data mining, mirroring, avoiding the official API limits etc...

Google may have written agreements with Craigslist, they're both enormous companies...

Finally, as others have said it's a legal grey area. It's not completely clear and it basically depends on what websites you're scraping, how you use the data and why...

Maybe it's best to just ask the website?

backend-dev-33, almost 6 years ago

And here is Part 1 as a link: https://news.ycombinator.com/item?id=20256681