TechEcho

westside1506 about 16 years ago

Hi guys. We were actually about to do an "Ask HN: Review our startup" post, but I guess someone beat us to it.

So, please review our startup. :)

We are launching the beta today to a handful of users and will be letting in more and more users over time.

One other note: We don't just offer crawling. Our model is actually to allow you to analyze the web content that you discover. Using your own custom code that you push into 80legs, you can do sophisticated text processing and image processing, look inside PDFs, etc.

mjs about 16 years ago

Interesting, it's a botnet! From the FAQ: "How can the prices be so low?" "Plura pays developers to embed lightweight widgets in their desktop applications or websites. These widgets harness the idle and excess bandwidth and computing power on the computers of people using the applications and websites."

gojomo about 16 years ago

Very interesting service! A number of questions...

What User-Agent do you use?

Do you crawl non-textual resources?

Do you save all headers from the crawled responses?

Do you perform any processing on the returned content (like de-chunking or de-compressing) or can it be retrieved verbatim?

If two customers request the same URL/site be crawled, are their requests merged so the site is only crawled once?

Do you save the exact time of the request (not trusting the returned 'Date' header)?

80 legs: Web Crawler as a Service

3 comments
