Google's fact-checking bots build vast knowledge bank

136 points by spountzy, almost 11 years ago

17 comments

bra-ket, almost 11 years ago
Kevin Murphy (https://github.com/murphyk) is the lead developer of the Bayes Net Toolbox (https://code.google.com/p/bnt/) and PMTK: https://github.com/probml/pmtk3

This knowledge graph is probably the largest Bayesian network out there.

sixQuarks, almost 11 years ago
This is going to set the stage for the next battle between spammers and Google.

Spammers will be populating the web with "facts" that suit themselves.

dm2, almost 11 years ago
>> "Behind the scenes, Google doesn't only have public data," says Suchanek. It can also pull in information from Gmail, Google+ and Youtube. "You and I are stored in the Knowledge Vault in the same way as Elvis Presley," Suchanek says.

I really hope Google does not use Gmail data for projects other than ads. They really need to ask users to opt in to this kind of data sharing. I'm ok with Gmail being read for ads, but almost anything else is unethical, especially some experimental knowledge base.

discardorama, almost 11 years ago
How does this compare with NELL [0] from CMU? I'm assuming it's something like NELL, but scaled up 1000x because Google is not limited in how often it can search its own index, whereas NELL is limited to 10K queries/day?

[0] http://rtw.ml.cmu.edu/rtw/

murphyk, more than 10 years ago
Hi, I’m Kevin Murphy, one of the researchers at Google who worked on this project. Just to be clear, KV did NOT involve any private data sources -- it just analyzed public text on the web. (And yes, we do try to estimate reliability of the facts before incorporating them into KV.) Also, KV is not a launched product, and is not replacing Knowledge Graph.

Unfortunately, I cannot do a more detailed Q&A here, but if you want more details, please read the original paper here: http://www.cs.cmu.edu/~nlao/publication/2014.kdd.pdf. (Note that an earlier version of the work was presented at a CIKM workshop in Oct 2013; see http://www.akbc.ws/2013/ and http://cikm2013.org/industry.php#kevin.) We have also published tons of great related research at http://research.google.com/pubs/papers.html

dctoedt, almost 11 years ago
Sounds a bit like Douglas Lenat's CYC project from the 1980s [1], but done by machine.

[1] http://en.wikipedia.org/wiki/Cyc

turbolent, almost 11 years ago
Paper: http://www.cs.cmu.edu/~nlao/publication/2014.kdd.pdf

batbomb, almost 11 years ago
HNers interested in this might also be interested in DeepDive from Stanford CS Professor Chris Ré.

http://deepdive.stanford.edu/

panarky, almost 11 years ago
*It might even be possible to use a knowledge base as detailed and broad as Google's to start making accurate predictions about the future based on analysis and forward projection of the past.*

Hello Hari Seldon, psychohistory and mathematical sociology!

http://en.wikipedia.org/wiki/Foundation_series

http://en.wikipedia.org/wiki/Mathematical_sociology

walterbell, almost 11 years ago
Is any subset of the "derived knowledge" from public websites and data contributed back to a public dataset like DBpedia?

There are bots [1] making Wikipedia contributions; Google could also make automated contributions to Wikipedia/Wikidata.

[1] http://wikipedia-edits.herokuapp.com/

jnbiche, almost 11 years ago
I see a lot of downvoting here of posts that express very reasonable concerns about privacy *if* Google is actually using private emails for this AI.

That Google is engaging in this behavior is indeed speculation, as far as I know. However, Google employees/allies have to realize that attempts to suppress debate on this issue can only backfire on them. Indeed, the fact that they don't have an explicit policy on this (correct me if I'm wrong) is one of the reasons researchers are speculating.

It may well be that most people would agree with and/or permit Google to use their data in this way, but people should be given the opportunity to debate it in a reasonable fashion, else it looks like it was forced down their throats. And that's no good for anyone.

dave_sullivan, almost 11 years ago
>> "Behind the scenes, Google doesn't only have public data," says Suchanek. It can also pull in information from Gmail, Google+ and Youtube. "You and I are stored in the Knowledge Vault in the same way as Elvis Presley," Suchanek says.

Ugh... that's a bit much... because now any employee at Google could potentially get access to random facts about me gleaned from my personal and business emails? Good luck keeping different levels of confidential information segregated correctly. That's awesome.

holri, almost 11 years ago
Facts are not knowledge. Read Socrates / Plato.

ck2, almost 11 years ago
Isn't it nice that millions of people made web pages that Google decided to scrape to harvest the work of others and run ads next to it for themselves?

Now try scraping Google and see what they do to you.

plicense, almost 11 years ago
"Knowledge Vault has pulled in 1.6 billion facts to date." Does this fact also include the fact that I am adding more facts right now? What fact metric is this fact?

hanula, almost 11 years ago
Are there any open source efforts like this?

illumen, almost 11 years ago
Knowing that people who collected a lot of that data, and whom we trusted, have since left Google, I wonder what other non-public data is being used, how it is being used, and whether it is used only for good purposes or for nefarious ones.