Diffbot launches AI-powered knowledge graph of 1T facts

47 points by borisjabes over 6 years ago

3 comments

d--b over 6 years ago
It's incredible that people still try to do things like this.

This is such an overambitious problem...

1. Web pages are laden with errors. How do you deal with this?
2. Knowledge does not fit in a graph. It's asymptotically a graph, as in: I can define relationships like "this recipe contains carrots; carrots contain sugar => this recipe contains sugar." Cool. But what about this "sugar-free carrot cake recipe"? Well, it still contains carrots, so it still contains sugar... Contradiction? => requires human curation...
3. It doesn't even solve a real problem... Look at IBM Watson: it probably knows a lot more crap than Diffbot, and yet it is a pretty useless piece of software...
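The contradiction described in point 2 can be made concrete with a small sketch. This is only an illustration of naive transitive inference over "contains" edges, not how Diffbot or any real knowledge graph works; the `CONTAINS` and `FREE_OF` tables and both function names are hypothetical, built from the carrot cake example in the comment.

```python
# Hypothetical "contains" edges, taken from the example in the comment.
CONTAINS = {
    "sugar-free carrot cake": {"carrots"},
    "carrots": {"sugar"},
}

# Hypothetical label-derived claims that an item does NOT contain something.
FREE_OF = {
    "sugar-free carrot cake": {"sugar"},
}


def transitive_contents(item, edges):
    """Return everything reachable from `item` by following 'contains' edges."""
    seen = set()
    stack = [item]
    while stack:
        current = stack.pop()
        for child in edges.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def contradictions(item, edges, free_of):
    """Return ingredients that the graph infers but the item's label denies."""
    return transitive_contents(item, edges) & free_of.get(item, set())


if __name__ == "__main__":
    print(contradictions("sugar-free carrot cake", CONTAINS, FREE_OF))
```

Naive transitivity flags "sugar" as both present and absent, which is the point where the commenter argues a human curator would have to step in.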
subhobroto over 6 years ago
Mike,

Fantastic work here! As someone who's really excited about a machine-readable web and has been working on it, this is fantastic! Unfortunately, while the Semantic Web was meant to tackle this, the real-life proliferation of the Semweb has been, at least to me personally, extremely disappointing.

So this is a fantastic initiative for me to know about.

Is there a plan to expose this data via a dev API of some sort for enthusiasts like us?

Say, a SPARQL or even an (Open)Graph API, perhaps?

My experience consulting and working with companies interested in the domain has been that monetizing this data is extremely hard, both legally and quality-wise.

"Nike Tanjun near me" is a query fraught with danger. People typing this query want to find a retailer in their vicinity that sells this Nike product, but where do we source that inventory list from, and how do we get our cut?

Before people start talking about DSPs and SSPs, this is a very different problem at hand.

To know that Nike Tanjun is a shoe sold by Nike, an ontology needs to exist that captures this knowledge so that the user's query can be decoded.

How will that ontology be sourced? Further, for it to be usable commercially, Nike has to agree to that encoding. Therein begin the challenges. If we encoded Nike Downshifter as Tanjun by mistake, and the user then bought them based on our results and disliked them because they expected the Downshifters to be like Tanjuns, we have an issue, and Nike could pursue the matter because we misled the customer and affected their branding.

My primary clients are search companies or companies that want to provide rich search functionality. Google, Bing, or even DDG do a phenomenal job in this space, and the barrier to entry is pretty high.

So, knowledge quality, maintenance, versioning, and temporal resolution ("The President of the U.S." and "The iPhone" are different entities over time) aside, is Diffbot going to monetize this knowledge only as a B2B offering/add-on for its clients, or are there other "bigger" plans to monetize this tremendous undertaking and keep it rolling in the future?
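The temporal resolution issue mentioned above ("The President of the U.S." naming different entities over time) can be sketched as a lookup over validity intervals. This is a minimal illustration under stated assumptions, not a description of Diffbot's data model; the `ROLE_HOLDERS` table and the `resolve` function are hypothetical names introduced here.

```python
from datetime import date

# Illustrative validity intervals: (entity, valid_from, valid_until).
# An interval covers valid_from <= day < valid_until.
ROLE_HOLDERS = {
    "The President of the U.S.": [
        ("Barack Obama", date(2009, 1, 20), date(2017, 1, 20)),
        ("Donald Trump", date(2017, 1, 20), date(2021, 1, 20)),
    ],
}


def resolve(label, on):
    """Return the entity that `label` referred to on day `on`, or None."""
    for entity, start, end in ROLE_HOLDERS.get(label, []):
        if start <= on < end:
            return entity
    return None


if __name__ == "__main__":
    print(resolve("The President of the U.S.", date(2016, 6, 1)))
    print(resolve("The President of the U.S.", date(2018, 8, 31)))
```

Storing the interval alongside each fact means the same label can resolve to different entities depending on the date of the query or of the source document, at the cost of having to maintain those intervals as facts change.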
johnymontana over 6 years ago
What technologies is this built on? Something like the Neo4j graph database?