
GPT-3 is much better at math than it should be

50 points, by Liron, about 2 years ago

9 comments

Imnimo, about 2 years ago
I'd be curious to see more examination of the questions and answers at the token level, rather than by counting digits or calculating percentage error. For example, according to https://platform.openai.com/tokenizer, 727941 + 761830 is split as 7,279,41, +, 76,18,30. The answer given was 1589771 (as opposed to 1489771). To me that looks like it correctly added 41 and 30, but had trouble with the mis-matched tokenizations of 7,279 and 76,18. I wonder if that sort of pattern would hold in general?
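For illustration, here is a minimal sketch of that kind of token-level inspection using the open-source tiktoken library. The choice of the "p50k_base" encoding as a stand-in for a GPT-3-era tokenizer is an assumption, and its splits may differ from the web tokenizer the comment cites.

```python
# Rough sketch: inspect how a tokenizer splits numbers in an arithmetic prompt.
# "p50k_base" is assumed here as a GPT-3-era encoding and may not match the
# web tokenizer referenced in the comment above.
import tiktoken

enc = tiktoken.get_encoding("p50k_base")

prompt = "727941 + 761830"
token_ids = enc.encode(prompt)

# Decode each token id on its own to see the digit groupings the model sees;
# the comment reports 7|279|41 and 76|18|30 on OpenAI's web tokenizer.
pieces = [enc.decode([tid]) for tid in token_ids]
print(pieces)
```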
YeGoblynQueenne, about 2 years ago
>> But this program is representable by a neural net; after all, neural nets are turing complete. [1]

This is indeed evidence of an interesting phenomenon. It seems that many of the hare-brained things that people say lately are conclusions they have drawn starting from the premise that neural nets are somehow magickal and mysterious, and so they can do anything and everything anyone could imagine, and we don't even really need to come up with any other explanation for those wonders than "it's a neural net!".

So, for example, the author can claim that "there's some sort of fuzzy arithmetic engine at the heart of GPT-3", without having to explain what, exactly, a "fuzzy arithmetic engine" is (it's just "some sort" of thing, who cares?) and why we need such a device to explain the behaviour of a language model.

Then again, what's the point? People write stuff on the internets. Now we have language models trained on that nonsense. Things can only get worse.

_______________

[1] The link in the article points to a paper on the computational capabilities of Recurrent Neural Nets (RNNs), not "neural nets" in general. The Transformer architecture used to train GPT-3's model is not an RNN architecture. In any case, the linked paper, and papers like it, only show that one can simulate any Turing machine with a specially constructed net. To *learn* a neural net that simulates any Turing machine (i.e. without hand-crafting) one would have to train it on Turing machines; and probably *all* Turing machines. GPT-3's model, besides not being an RNN, was trained on text, not Turing machines, so there are a few layers of strong assumptions needed before one can claim that it somehow, magickally, turned into a model of a Turing machine.

Anyway, the Turing-complete networks discussed in the linked paper, and similar work, inherit the undecidability of Universal Turing Machines, and so it is impossible to predict the value of any activation function at any point in time. Which means that, if a neural net ever really went Turing complete, we wouldn't be able to tell whether its training has converged, or if it ever will. So that's an interesting paper, which the author clearly didn't read. I guess there's too many scary maths for a "layman". Claiming that GPT-3 has "some sort of fuzzy arithmetic engine" doesn't need any maths.
echlebek, about 2 years ago
Given that ChatGPT can't correctly answer questions like "What weighs more, a pound of bricks or two pounds of feathers?", I can't say I agree.
mdmglr, about 2 years ago
I asked ChatGPT to play this game:

I will give you 2 strings A and B.

You must tell me what operations from the list below to use to transform string A into B. You can use as many operations as you want, but the more operations you use, the fewer points you get.

Insert(a,b) - insert character a at position b.
Delete(b) - delete the character at position b.
Swap(a,b) - swap the character at position a with the character at position b.

A: ello B: Hello

The answer is Insert(H, 0).

Try it for yourself and you will quickly see how bad ChatGPT is, and how simple it is to trick humans into thinking you are intelligent.
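For reference, a minimal sketch of a checker for this game: it applies a proposed operation list to A and confirms it yields B. The tuple encoding of operations is an assumption made for the sketch, not part of the original comment.

```python
# Minimal checker for the string-transformation game described above.
# Operations are encoded as tuples (an assumption for this sketch):
#   ("insert", ch, pos), ("delete", pos), ("swap", i, j)

def apply_ops(a: str, ops) -> str:
    """Apply the three game operations to string a, in order."""
    chars = list(a)
    for op in ops:
        kind = op[0]
        if kind == "insert":      # insert character ch at position pos
            _, ch, pos = op
            chars.insert(pos, ch)
        elif kind == "delete":    # delete the character at position pos
            _, pos = op
            del chars[pos]
        elif kind == "swap":      # swap the characters at positions i and j
            _, i, j = op
            chars[i], chars[j] = chars[j], chars[i]
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return "".join(chars)

def score(a: str, b: str, ops) -> int:
    """Return the number of operations if they transform a into b; fewer is better."""
    if apply_ops(a, ops) != b:
        raise ValueError("operations do not transform A into B")
    return len(ops)

# The example from the comment: a single insertion turns "ello" into "Hello".
print(score("ello", "Hello", [("insert", "H", 0)]))   # -> 1
```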
ilaksh, about 2 years ago
ChatGPT is now two different models: Default ("turbo") or "legacy" (slower and better, from a week ago or whatever). Not specifying which in these types of experimental reports is a big oversight.

You will not see the option unless you buy ChatGPT Plus. I assume the non-Plus version is "turbo" now.
lIl-IIIl, about 2 years ago
I think it's better than most people at arithmetic problems if they have to solve them by hand.

Perhaps if the prompt included "double-check your answer", just like math teachers tell students, the correct-answer rate would be higher?
ryankrage77, about 2 years ago
It's better at math than I am. I can't get anywhere close to GPT-3's accuracy when multiplying two three-digit numbers in the same amount of time.
MagicMoonlight, about 2 years ago
It is awful at math because it has no understanding of anything.

It can output the correct answer if the correct answer has previously been shown to it, but it may equally just output garbage, because it RNGs its answer.
pk-protect-ai, about 2 years ago
GPT-3 is much worse at math in comparison to BLOOM. GPT-3 honestly sux at math, as it should.