TechEcho

4 comments

jack_deneut almost 16 years ago

The blog post I wrote wasn't primarily about the legality of scraping (and I also didn't expect it to be read by more than a few people). But as that seems to be the topic of the thread, here's my response.

The courts found that it isn't possible to copyright facts, and that's all we were scraping - things like addresses, business name, and phone number. We weren't even scraping things like business category, because something as simple as putting a restaurant in the category "Fine Dining" might be considered a judgment call and therefore value-add by the original site.

And think of what would have happened if the court had found otherwise (i.e. had found that lists of facts could be copyrighted). If you opened a store, and I was the first one to put your address and phone number on-line, no one else could ever include your address or phone number on their site. Even if you created a website for your own business after I published your address, you wouldn't be able to include it on your site, because you'd violate my copyright.

I can't see how the Supreme Court could have ruled any other way.

mshafrir almost 16 years ago

"We've tried scraping ourselves in the past (yes, it's perfectly legal),"

Is scraping indeed "perfectly legal"?

jshen almost 16 years ago

There will always be garbage in. Your algorithms have to overcome this for the most part. Some things have to be manually dealt with and some things could be manually dealt with, but it's impossible to manually verify tens of millions of local listings.

mbarr almost 16 years ago

It looks like it still needs a lot of work. As a quick test I looked for Sports Bars in London (via their categories) and it returned an Antique Shop in Westerham. I then tried editing the record to remove irrelevant categories and got a server error.

Garbage In, Garbage Out: Why Scraping Doesn't Work for Local Search
