TechEcho

Part 1: https://news.ycombinator.com/item?id=20256681 (closed to new comments)

What (legal?) trick allows search engines to crawl (well, we know that "crawl" is a synonym of "scrape") and index content protected by terms of use? Is it "fair use" or something else?

One example: Craigslist!

In their terms of service:

> USE. Unless licensed by us in a written agreement, you agree not to use or provide software (except general purpose web browsers and email clients) or services that interact or interoperate with CL, e.g. for downloading, uploading, creating/accessing/using an account, posting, flagging, emailing, searching, or mobile use. You agree not to copy/collect CL content via robots, spiders, scripts, scrapers, crawlers, or any automated or manual equivalent (e.g., by hand).

On the other hand: https://www.google.com/search?q=site%3Asfbay.craigslist.org+couch&oq=site%3Asfbay.craigslist.org+couch

Google is able to index CL, and you can query the Google index specifying "use only this CL city" and see the ads. We also know Google is making money from it (advertising, for example).

I cannot imagine Google obtaining a "written agreement" from CL ))

2 comments

gitgud, almost 6 years ago

Crawling is not entirely synonymous with scraping. Crawling web pages is usually done with the specific purpose of search and categorisation, whereas scraping is much more general: data mining, mirroring, avoiding the official API limits, etc.

Google may have written agreements with Craigslist; they're both enormous companies...

Finally, as others have said, it's a legal grey area. It's not completely clear, and it basically depends on which websites you're scraping, how you use the data, and why...

Maybe it's best to just ask the website?

backend-dev-33, almost 6 years ago

And here is Part 1 as a link: https://news.ycombinator.com/item?id=20256681

Ask HN: What’s the legality of web scraping? Part 2
