Teaching a neural network to use a calculator

78 points by baylearn over 5 years ago

5 comments

FraserGreenlee over 5 years ago
Here the neural network was given examples of how to use the calculator for each question, which means it wasn't generating its own abstractions.

If you wanted to use this to solve other (e.g. programming) problems, you would need examples of every step required for almost every problem.

Using neural networks in this way is akin to locality-sensitive hashing. Instead, the network should understand what its lowest-level operators do and discover useful combinations of them that can solve new problems.
fyp over 5 years ago
I haven't been following this field, but does anyone know what happened to Neural Programmer-Interpreters (2015)? It seemed like such a promising direction back then. It showed that a neural network can learn to use arbitrary commands to execute algorithms such as multidigit addition and bubble sort: http://www-personal.umich.edu/~reedscot/iclr_project.html

That seems like a much better demo of using blackbox tools as substeps in problem solving. Is there a reason why it shouldn't work when the blackbox is a more complex function like sympy's eval?
JHonaker over 5 years ago
> Something that intrigued me in Saxton et al.'s paper was how high a baseline transformer scored on probability tasks (~0.77 and ~0.73), given that working these out is a multi-step process. How could basic pattern-matching score so highly on such a task? Is mere perception enough to figure out something like the probability product rule, on such a generic architecture without any prior knowledge of numbers or probability?

> To try and explain this, we point out that although questions are unique, a lot of them will share the same answers. For example, Calculate prob of sequence aad from abcda, Calculate prob of sequence bbz from zbbmn, and Calculate prob of sequence rpr from {r: 2, p: 1, x:2} all lead to the same answer, 1/30.

> Doing a bit of analysis on training set questions, we find that out of 1 million samples each, swr_p_level_set and swr_p_sequence have 977179 and 978045 unique questions, respectively. This seems reasonable, as duplicates are limited to <3% of the training set and the distribution over questions appears fairly uniform.

> On the other hand, doing analysis on training set answers reveals that out of 1 million samples each, swr_p_level_set and swr_p_sequence have 1458 and 1865 unique answers, respectively.

> Counting the collective number of samples that share the top K most common answers reveals even more imbalance.

This is the real takeaway for me from the article.
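
The quoted claim that three differently worded questions share the answer 1/30 can be checked by hand with the product rule. Below is a minimal sketch, assuming the questions mean drawing letters in order without replacement (a reading of the `swr_` prefix that is an assumption here, not something stated in the thread). The function name `prob_of_sequence` is hypothetical and only used for illustration.

```python
from collections import Counter
from fractions import Fraction


def prob_of_sequence(sequence, bag):
    """Probability of drawing `sequence` in order from `bag` without replacement.

    `bag` may be a string of letters ("abcda") or a dict of letter counts
    ({"r": 2, "p": 1, "x": 2}). Assumes sampling without replacement.
    """
    counts = Counter(bag)            # letter -> how many copies remain
    remaining = sum(counts.values()) # total letters left in the bag
    prob = Fraction(1)
    for letter in sequence:
        if remaining == 0 or counts[letter] == 0:
            return Fraction(0)       # the sequence cannot be drawn
        # Product rule: multiply by the chance of this draw given earlier draws.
        prob *= Fraction(counts[letter], remaining)
        counts[letter] -= 1
        remaining -= 1
    return prob


# The three example questions quoted from the article.
questions = [
    ("aad", "abcda"),
    ("bbz", "zbbmn"),
    ("rpr", {"r": 2, "p": 1, "x": 2}),
]
answers = [prob_of_sequence(seq, bag) for seq, bag in questions]
print(answers)                  # all three are Fraction(1, 30)
print(len(set(answers)))        # 1 distinct answer for 3 distinct questions
```

Each case reduces to 2/5 × 1/4 × 1/3, which is why surface-level differences in the question text do not change the answer. This illustrates how a large set of unique questions can collapse onto a small set of unique answers.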
king07828 over 5 years ago
From the title, I was expecting the neural network to take an input (e.g., speech or a string "5+11+3=") and then control mouse movements to push the keys on a calculator program (e.g., Windows Calculator). I.e., a neural network driving an existing user interface based on commands from a user.

But the article is more about using neural network transformers to build steps of a mathematical proof with each step checked by a symbolic "calculator". I.e., transformers applied to mathematical proofs.
The_rationalist over 5 years ago
The fact that a neural network isn't even able to calculate, even when trained only to do this, shows how limited neural-network-only AGIs are.