科技回声

5 comments

htrp, over 3 years ago

I would strongly suggest looking at a guide from an actual law firm like Akin Gump [1] rather than a web scraping site that provides a call to action like the one below:

> Speak to a CrawlNow data expert today to explore new opportunities for using data to fuel growth for your business.

[1] https://www.akingump.com/a/web/soxXRQ6Nw48FehNvwpdjJ1/2jiuhx/hflr-reprint-to-scrape-or-not-to-scrape-rappaport-altman-handschumacher-4819-0662-7801-v1.pdf

fiddlerwoaroof, over 3 years ago

I’ve never understood why using a different user agent should make a difference. Ethically, if I can see the data in a web browser, I already have access to it and no one has any business dictating to me the programs I may use to access that data.

repiret, over 3 years ago

> Trespass To Chattels is a law that governs the wrongful use of someone’s digital property.

Statements like that make me suspicious of the quality of the rest of the analysis.

Jensson, over 3 years ago

> A website is the property of the website’s owner.

No. For example, the information a user puts on LinkedIn is that user's property. The user put it on LinkedIn because the user wants the world to see it, so scraping LinkedIn to find candidates for a job doesn't violate anyone's property rights. LinkedIn might still complain about server costs, which is a valid concern, but they can't say that they own the data users themselves submitted, regardless of what their EULA says.

Treating user-submitted data as property of the host just creates lock-in. I don't see any reason why that would be a good policy.

JeffCarterXerox, over 3 years ago

I think we should be looking more at intent than at the semantics of how web scraping can be achieved.

Whether you're setting user agent strings or taking screenshots of content doesn't really matter. What matters is what you do with the content/data.

I could build a scraper to mine data on a mass scale, stick it all in a db, and instantly clear it. What are my intentions here? Learn a new skill, experiment?

One example in the comments was about phone scammers. Similar phone calls have been made in jest on radio talk shows, maybe not about scamming but impersonating famous people. What differs is the intent.

Proving intent is also difficult, as initial intent could be disguised to hide a more sinister agenda, akin to a money laundering operation. But at the root of everything will lie intent, and that's what you have to get to regardless of the moral arguments.

The Legality of Web Scraping