Rover: Reasoning over Rules

39 points by datashrimp over 5 years ago

4 comments

YeGoblynQueenne over 5 years ago
Seems to be overfitting to statistical regularities in the dataset, or in any case it completely ignores the facts and rules you give it and draws the answer from who knows where:

    Metals ermuf electricity.
    Insulators do not ermuf electricity.
    If something is made of gudranga then it is metal.
    Nails are made of gudranga.
    Nails conduct electricity.
    ROVER prediction: Nails conduct electricity. True (confidence = 0.99)

Yes, it can tell that nails ermuf electricity:

    ROVER prediction: Nails ermuf electricity. True (confidence = 0.99)

However, it also thinks that nails gudranga electricity:

    ROVER prediction: Nails gudranga electricity. True (confidence = 0.99)

So in short, it is very determined to find that Nails Y electricity, for whatever Y, whether Y is something that relates nails to electricity or not.

https://rule-reasoning.apps.allenai.org/?p=Metals%20ermuf%20electricity.%20%0AInsulators%20do%20not%20ermuf%20electricity.%20%0AIf%20something%20is%20made%20of%20gudranga%20then%20it%20is%20metal.%20%0ANails%20are%20made%20of%20gudranga.&q=Nails%20ermuf%20electricity
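The probe in this comment can be repeated for other verbs by generating demo links. The following is a minimal sketch: the demo address and the `p` (theory) and `q` (statement to test) parameters are taken from the link in the comment, while `probe_url` and `verb_probes` are hypothetical helper names, and whether the demo is still reachable is not something the sketch checks.

```python
from urllib.parse import quote

# Demo address and the `p` / `q` parameter names come from the link in the comment.
DEMO_URL = "https://rule-reasoning.apps.allenai.org/"

# The four statements from the comment's theory.
FACTS = [
    "Metals ermuf electricity.",
    "Insulators do not ermuf electricity.",
    "If something is made of gudranga then it is metal.",
    "Nails are made of gudranga.",
]

def probe_url(facts, question):
    """Hypothetical helper: put the theory in `p` and the statement to test in `q`."""
    theory = "\n".join(facts)
    return f"{DEMO_URL}?p={quote(theory, safe='')}&q={quote(question, safe='')}"

def verb_probes(verbs):
    """Build one link per candidate verb Y in 'Nails Y electricity'."""
    return {verb: probe_url(FACTS, f"Nails {verb} electricity") for verb in verbs}

if __name__ == "__main__":
    for verb, url in verb_probes(["ermuf", "gudranga", "conduct"]).items():
        print(verb, url)
```

Comparing the predictions for a verb that the theory supports ("ermuf") against one it does not ("gudranga") is the check the comment describes.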
andreyk over 5 years ago
Here's a link to the paper: https://arxiv.org/abs/2002.05867

Shortened Abstract: "AI has long pursued the goal of having systems reason over *explicitly provided* knowledge, but building suitable representations has proved challenging. Here we explore whether transformers can similarly learn to reason (or emulate reasoning), but using rules expressed in language, thus bypassing a formal representation. We provide the first demonstration that this is possible, and characterize the extent of this capability. To do this, we use a collection of synthetic datasets that test increasing levels of reasoning complexity (number of rules, presence of negation, and depth of chaining). We find transformers appear to learn rule-based reasoning with high (99%) accuracy on these datasets, and in a way that generalizes to test data requiring substantially deeper chaining than in the training data (95%+ scores). We also demonstrate that the models transfer well to two hand-authored rulebases, and to rulebases paraphrased into more natural language."

The performance numbers are pretty impressive IMO. But learning from synthetic datasets is pretty perilous, and it's hard to say if it'll generalize well. Kudos to them for putting out a live demo.
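For readers unfamiliar with "depth of chaining" in the abstract, here is a small symbolic sketch of what the term measures. It is a plain forward-chaining loop over already-instantiated rules, not the transformer approach the paper studies. The function name `forward_chain` and the rule encoding are assumptions made for illustration.

```python
def forward_chain(facts, rules):
    """Derive everything reachable from `facts`.

    `rules` is a list of (premises, conclusion) pairs over plain strings.
    Returns {statement: depth}, where depth counts the rule applications
    in the first derivation this loop finds (given facts have depth 0).
    """
    depth = {fact: 0 for fact in facts}
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in depth and all(p in depth for p in premises):
                depth[conclusion] = 1 + max((depth[p] for p in premises), default=0)
                changed = True
    return depth

if __name__ == "__main__":
    facts = ["nails are made of gudranga"]
    rules = [
        (["nails are made of gudranga"], "nails are metal"),
        (["nails are metal"], "nails ermuf electricity"),
    ]
    for statement, d in forward_chain(facts, rules).items():
        print(d, statement)
```

In this toy theory "nails ermuf electricity" needs two rule applications, so a question about it sits at chaining depth 2.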
asacalowww over 5 years ago
Just tried a set of classic non-monotonic reasoning statements and it didn't like it much: https://rule-reasoning.apps.allenai.org/?p=Penguins%20are%20birds%0ABirds%20can%20typically%20fly%0APenguins%20cannot%20fly%0ATweety%20is%20a%20bird&q=Can%20tweety%20fly%3F
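As a reference point for what a non-monotonic reading of these statements looks like, here is a minimal sketch in which the most specific class wins. This is one common way to treat the Tweety example, not a description of how ROVER works. The tables `ISA`, `DEFAULTS`, `EXCEPTIONS` and the function names are assumptions made for illustration.

```python
# Encoding of the statements in the link (an illustrative choice, not ROVER's).
ISA = {"penguin": "bird"}          # Penguins are birds
DEFAULTS = {"bird": True}          # Birds can typically fly
EXCEPTIONS = {"penguin": False}    # Penguins cannot fly

def classes_of(kind):
    """Return the class chain for `kind`, most specific first."""
    chain = [kind]
    while chain[-1] in ISA:
        chain.append(ISA[chain[-1]])
    return chain

def can_fly(kind):
    """Answer from the most specific class that says anything; None if unknown."""
    for cls in classes_of(kind):
        if cls in EXCEPTIONS:
            return EXCEPTIONS[cls]
        if cls in DEFAULTS:
            return DEFAULTS[cls]
    return None

if __name__ == "__main__":
    print(can_fly("bird"))     # Tweety is only known to be a bird
    print(can_fly("penguin"))  # after learning Tweety is a penguin
```

The non-monotonic part is that adding the fact "Tweety is a penguin" withdraws the earlier conclusion that Tweety can fly, which classical rule chaining cannot do.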
scribu over 5 years ago
It doesn't seem very precise. For example, it doesn't seem to distinguish between "Is" and "Is like":

https://rule-reasoning.apps.allenai.org/?p=A%20pear%20is%20a%20type%20of%20fruit.%0AA%20pear%20is%20like%20an%20apple.&q=An%20apple%20is%20a%20fruit.%0AAn%20apple%20is%20a%20pear.%0AA%20pear%20is%20an%20apple.%0AA%20fruit%20is%20a%20pear

Edit: Added more test cases
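The test cases are only visible inside the percent-encoded link, so a short sketch for decoding it may help. It uses the standard library's `urllib.parse`; `decode_probe` is a hypothetical helper name, and the reading of `p` as the theory and `q` as the statements to test is inferred from the links in this thread.

```python
from urllib.parse import urlsplit, parse_qs

# The link from the comment, with HTML escaping removed.
LINK = (
    "https://rule-reasoning.apps.allenai.org/"
    "?p=A%20pear%20is%20a%20type%20of%20fruit.%0AA%20pear%20is%20like%20an%20apple."
    "&q=An%20apple%20is%20a%20fruit.%0AAn%20apple%20is%20a%20pear."
    "%0AA%20pear%20is%20an%20apple.%0AA%20fruit%20is%20a%20pear"
)

def decode_probe(link):
    """Split a demo link into (theory lines, question lines)."""
    query = parse_qs(urlsplit(link).query)
    return query["p"][0].splitlines(), query["q"][0].splitlines()

if __name__ == "__main__":
    theory, questions = decode_probe(LINK)
    print("Theory:")
    for line in theory:
        print("  " + line)
    print("Questions:")
    for line in questions:
        print("  " + line)
```

Decoded, the theory states only that a pear is a type of fruit and is like an apple, while the questions ask whether an apple is a fruit, an apple is a pear, a pear is an apple, and a fruit is a pear.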