Wanted to thank the HN community for all your encouragement. I first released the Diffbot API as a "Show HN:" post last year (http://news.ycombinator.com/item?id=2310852). $2M+ and lots of hard work later, we're powering some of the largest destination sites out there, like StumbleUpon and the new Digg.
Conceptually I like the product; it's something I would consider paying for. But in practice it doesn't seem to perform that well. It misclassifies things it should get right (an article hosted on Posterous, a YouTube page, Hacker News), and for some queries it just returns results for a completely different web page.

The page-tagging technology looks good, though.
Really like the vision approach to classifying web pages. I've been thinking Google should add this to their algorithm for a while (if they haven't already).

Classifying individual parts of pages (as Diffbot seems to be doing) is difficult, but I suspect Google could take screenshots of pages reported as spam as one class and compare those to screenshots of pages with high PageRank to get a pretty interesting classifier they could use as an extra data point. Could be an interesting experiment anyway, using data they've already got lying around.
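To make the screenshot idea concrete, here's a toy sketch in Python. The folder names, the thumbnail size, and the plain logistic-regression-on-raw-pixels setup are all placeholder assumptions of mine, not anything Diffbot or Google actually does; a real pipeline would render pages with a headless browser and use far richer visual features.

```python
# Toy sketch: classify page screenshots (spam vs. high-PageRank) from raw pixels.
# Assumes screenshots have already been captured to PNG files (not shown here).
from pathlib import Path

import numpy as np
from PIL import Image
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

THUMB_SIZE = (128, 128)  # downscale every screenshot to a fixed size


def load_screenshots(directory, label):
    """Load each PNG in `directory` as a flat grayscale pixel vector with `label`."""
    xs, ys = [], []
    for path in Path(directory).glob("*.png"):
        img = Image.open(path).convert("L").resize(THUMB_SIZE)
        xs.append(np.asarray(img, dtype=np.float32).ravel() / 255.0)
        ys.append(label)
    return xs, ys


def main():
    # Hypothetical folders: screenshots of reported-spam pages vs. high-PR pages.
    spam_x, spam_y = load_screenshots("screenshots/spam", 1)
    good_x, good_y = load_screenshots("screenshots/high_pr", 0)

    X = np.array(spam_x + good_x)
    y = np.array(spam_y + good_y)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))


if __name__ == "__main__":
    main()
```

Even something this crude would probably separate ad-farm layouts from real content pages better than you'd expect, which is the extra data point I'm getting at.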
I see some potential in ad tech.

How does caching work?
Is there any focus on security?
Multiple geolocations?

I liked the TOS :)
----
Diffbot.com is made available for personal, non-commercial, and commercial purposes. Services are provided as-is, and we do not make any guarantees on the quality or performance.
Wow, pretty cool. I wonder, though: is there much use for it outside of aggregator sites like Digg? Even for a site like Reddit, all the content is already split up into categories by users. While this is really cool, I'm not really seeing much use for it. What are some problems that this will solve?