TechEcho

GPT-3 is much better at math than it should be

50 points by Liron about 2 years ago

9 comments

Imnimo about 2 years ago
I'd be curious to see more examination of the questions and answers at the token level, rather than by counting digits or calculating percentage error. For example, according to https://platform.openai.com/tokenizer, 727941 + 761830 is split as 7,279,41, +, 76,18,30. The answer given was 1589771 (as opposed to 1489771). To me that looks like it correctly added 41 and 30, but had trouble with the mis-matched tokenizations of 7,279 and 76,18. I wonder if that sort of pattern would hold in general?
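The misalignment the comment describes can be made concrete with a small sketch. This is not the real GPT tokenizer; it just takes the token splits the comment reports from OpenAI's tokenizer page and computes which digit positions (counted from the right) each token chunk covers, with a hypothetical helper `chunk_spans`:

```python
def chunk_spans(tokens):
    """Return, for each token chunk, the (high, low) digit positions it
    covers, counting positions from the rightmost digit (position 0)."""
    spans, pos = [], 0
    for t in reversed(tokens):
        spans.append((pos + len(t) - 1, pos))
        pos += len(t)
    return list(reversed(spans))

# Token splits reported in the comment for 727941 and 761830
a = chunk_spans(["7", "279", "41"])
b = chunk_spans(["76", "18", "30"])
print(a)  # [(5, 5), (4, 2), (1, 0)]
print(b)  # [(5, 4), (3, 2), (1, 0)]
```

Only the final chunks ("41" and "30") cover the same digit span (1, 0); the higher chunks split the numbers at different digit boundaries. That lines up with the observation that the low digits of the sum were correct while the high digits were wrong.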
YeGoblynQueenne about 2 years ago
>> But this program is representable by a neural net; after all, neural nets are turing complete. [1]

This is indeed evidence of an interesting phenomenon. It seems that many of the hare-brained things that people say lately are conclusions they have drawn starting from the premise that neural nets are somehow magickal and mysterious, and so they can do anything and everything anyone could imagine, and we don't even really need to come up with any other explanation for those wonders than "it's a neural net!".

So, for example, the author can claim that "there's some sort of fuzzy arithmetic engine at the heart of GPT-3", without having to explain what, exactly, a "fuzzy arithmetic engine" is (it's just "some sort" of thing, who cares?) and why we need such a device to explain the behaviour of a language model.

Then again, what's the point? People write stuff on the internets. Now we have language models trained on that nonsense. Things can only get worse.

_______________

[1] The link in the article points to a paper on the computational capabilities of Recurrent Neural Nets (RNNs), not "neural nets" in general. The Transformer architecture used to train GPT-3's model is not an RNN architecture. In any case, the linked paper, and papers like it, only show that one can simulate any Turing machine with a specially constructed net. To *learn* a neural net that simulates any Turing machine (i.e. without hand-crafting) one would have to train it on Turing machines; and probably *all* Turing machines. GPT-3's model, besides not being an RNN, was trained on text, not Turing machines, so there are a few layers of strong assumptions needed before one can claim that it somehow, magickally, turned into a model of a Turing machine.

Anyway, the Turing-complete networks discussed in the linked paper, and similar work, inherit the undecidability of Universal Turing Machines, so it is impossible to predict the value of any activation function at any point in time. Which means that, if a neural net ever really went Turing complete, we wouldn't be able to tell whether its training has converged, or if it ever will. So that's an interesting paper, one that the author clearly didn't read. I guess there's too much scary maths for a "layman". Claiming that GPT-3 has "some sort of fuzzy arithmetic engine" doesn't need any maths.
echlebek about 2 years ago
Given that ChatGPT can't correctly answer questions like "What weighs more, a pound of bricks or two pounds of feathers?", I can't say I agree.
mdmglr about 2 years ago
I asked ChatGPT to play this game:

I will give you 2 strings A and B.

You must tell me which operations from the list below to use to transform string A into B. You can use as many operations as you want, but the more operations you use, the fewer points you get.

Insert(a, b) - insert character a at position b. Delete(a) - delete the character at position a. Swap(a, b) - swap the character at position a with the character at position b.

A: ello B: Hello

Answer is Insert(H, 0).

Try it for yourself and you will quickly see how bad ChatGPT is at this, and how simple it is to trick humans into thinking you are intelligent.
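For reference, the game above has a mechanical solution. The sketch below is not ChatGPT's behaviour, just a baseline: it uses the classic Levenshtein dynamic program with insert/delete/replace and a backtrace to recover an edit script (the comment's Swap operation is not modeled, and the operation format mimics the comment's notation):

```python
def edit_script(a, b):
    """Return a minimal list of Insert/Delete/Replace operations
    transforming string a into string b (Levenshtein backtrace)."""
    m, n = len(a), len(b)
    # d[i][j] = edit distance between a[:i] and b[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # delete
                          d[i][j - 1] + 1,      # insert
                          d[i - 1][j - 1] + cost)  # match/replace
    # Walk back from d[m][n] to recover the operations
    ops, i, j = [], m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and a[i - 1] == b[j - 1] and d[i][j] == d[i - 1][j - 1]:
            i, j = i - 1, j - 1                 # characters match, no op
        elif j > 0 and d[i][j] == d[i][j - 1] + 1:
            ops.append(f"Insert({b[j - 1]}, {j - 1})")
            j -= 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(f"Delete({i - 1})")
            i -= 1
        else:
            ops.append(f"Replace({b[j - 1]}, {i - 1})")
            i, j = i - 1, j - 1
    return list(reversed(ops))

print(edit_script("ello", "Hello"))  # ['Insert(H, 0)']
```

On the comment's example it recovers exactly the expected answer, a single Insert(H, 0).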
ilaksh about 2 years ago
ChatGPT is now two different models: default ("turbo") or "legacy" (slower and better, from a week ago or whatever). Not specifying which in these types of experimental reports is a big oversight.

You will not see the option unless you buy ChatGPT Plus. I assume the non-Plus version is "turbo" now.
lIl-IIIl about 2 years ago
I think it's better than most people at arithmetic problems if they have to solve them by hand.

Perhaps if the prompt included "double-check your answer", just like math teachers tell students, the correct-answer rate would be higher?
ryankrage77 about 2 years ago
It's better at math than I am. I can't get anywhere close to GPT-3's accuracy when multiplying two three-digit numbers in the same amount of time.
MagicMoonlight about 2 years ago
It is awful at math because it has no understanding of anything.

It can output the correct answer if the correct answer has previously been shown to it, but it may equally well output garbage, because it just RNGs its answer.
pk-protect-ai about 2 years ago
GPT-3 is much worse at math than BLOOM. GPT-3 honestly sucks at math, as it should.