Transformer neural net learns to run Conway's Game of Life just from examples

69 points by montebicyclelo 6 days ago

9 comments

evrimoztamur 6 days ago
I would like to point out a much more exciting modelling process, whereby neural networks extract the underlying boolean logic from simulation outputs: https://google-research.github.io/self-organising-systems/difflogic-ca/?hn

I firmly believe that differentiable logic CA is the winner, in particular because it extracts the logic directly, and thus leads to generalizable programs as opposed to staying stuck in matrix multiplication land.
Comment #44018303 not loaded
Comment #44014652 not loaded
Comment #44015595 not loaded
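As a rough illustration of the differentiable logic CA idea mentioned above: each gate keeps trainable logits over the 16 two-input boolean functions, relaxed to real-valued arithmetic so gradients can flow, and is hardened to a single boolean gate after training. The sketch below is not code from the linked project; the function names and the numpy formulation are made up for illustration.

```python
# Minimal sketch of a differentiable logic gate: trainable logits over the 16
# two-input boolean functions, each relaxed to real-valued arithmetic.
import numpy as np

def boolean_relaxations(a, b):
    """Real-valued relaxations of all 16 two-input boolean functions.
    For a, b in {0, 1} these reproduce the exact truth tables."""
    return np.stack([
        np.zeros_like(a),          # FALSE
        a * b,                     # AND
        a - a * b,                 # a AND NOT b
        a,                         # A
        b - a * b,                 # NOT a AND b
        b,                         # B
        a + b - 2 * a * b,         # XOR
        a + b - a * b,             # OR
        1 - (a + b - a * b),       # NOR
        1 - (a + b - 2 * a * b),   # XNOR
        1 - b,                     # NOT B
        1 - b + a * b,             # a OR NOT b
        1 - a,                     # NOT A
        1 - a + a * b,             # NOT a OR b
        1 - a * b,                 # NAND
        np.ones_like(a),           # TRUE
    ])

def soft_gate(a, b, logits):
    """Differentiable gate: softmax-weighted mixture of the 16 relaxations."""
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return np.tensordot(weights, boolean_relaxations(a, b), axes=1)

def hard_gate(a, b, logits):
    """After training: commit to the most likely gate (exact boolean logic)."""
    return boolean_relaxations(a, b)[np.argmax(logits)]

logits = np.random.randn(16)        # trainable parameters of one gate
print(soft_gate(1.0, 0.0, logits))  # differentiable value between 0 and 1
print(hard_gate(1.0, 0.0, logits))  # hardened boolean output
```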
constantcrying 6 days ago
To be honest, an unsurprising result.

But I think the paper fails to answer the most important question. It alleges that this isn't a statistical model: "it is not a statistical model that predicts the most likely next state based on all the examples it has been trained on.

We observe that it learns to use its attention mechanism to compute 3x3 convolutions — 3x3 convolutions are a common way to implement the Game of Life, since it can be used to count the neighbours of a cell, which is used to decide whether the cell lives or dies."

But it is never actually shown that this is the case. It later on isn't even alleged that this is true; rather, the metric they use is that it gives the correct answers often enough, as a test for convergence, and not that the net has converged to values which give the correct algorithm.

But there is no guarantee that it actually has learned the game. There are still learned parameters, and the paper doesn't investigate whether these parameters have converged to something where the net is *actually* just a computation of the algorithm. The most interesting question is left unanswered.
Comment #44013546 not loaded
Comment #44013522 not loaded
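As context for the 3x3 convolution claim quoted above: one Game of Life step does reduce to a neighbour count (a 3x3 convolution with the centre zeroed) followed by a pointwise rule. The sketch below is not code from the paper; it is a minimal scipy/numpy illustration.

```python
# One Game of Life step as a 3x3 convolution (neighbour count) plus a rule.
import numpy as np
from scipy.signal import convolve2d

KERNEL = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])

def life_step(grid: np.ndarray) -> np.ndarray:
    """Advance a 0/1 grid by one generation (toroidal wrap-around boundary)."""
    neighbours = convolve2d(grid, KERNEL, mode="same", boundary="wrap")
    # A cell is alive next step if it has 3 neighbours, or 2 and is alive now.
    return ((neighbours == 3) | ((neighbours == 2) & (grid == 1))).astype(int)

# Example: a glider on an 8x8 grid.
grid = np.zeros((8, 8), dtype=int)
grid[1, 2] = grid[2, 3] = grid[3, 1] = grid[3, 2] = grid[3, 3] = 1
print(life_step(grid))
```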
Dwedit 6 days ago
RIP John Conway, died of Covid.
Nopoint2 6 days ago
I don't get the point. A simple CNN with stride = 1 should be able to solve it perfectly and generalize it to any size.
Comment #44013958 not loaded
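To make the stride-1 CNN remark concrete: a two-layer fully convolutional net with hand-set weights computes one exact Life step and, being stride-1 throughout, runs on any grid size. This is a sketch in PyTorch, not an architecture from the paper or the thread; it assumes dead cells beyond the border (zero padding), and the layer sizes and names are illustrative.

```python
# A stride-1 CNN with hand-set weights that computes an exact Life step.
import torch
import torch.nn as nn

def exact_life_cnn() -> nn.Sequential:
    # Each first-layer channel computes v = 2*(neighbour sum) + centre,
    # shifted by a different bias; v is in {5, 6, 7} exactly when the cell
    # is alive in the next generation.
    conv1 = nn.Conv2d(1, 4, kernel_size=3, padding=1, bias=True)
    kernel = 2.0 * torch.ones(3, 3)
    kernel[1, 1] = 1.0  # centre cell weighted 1, neighbours weighted 2
    with torch.no_grad():
        conv1.weight.copy_(kernel.repeat(4, 1, 1, 1))
        conv1.bias.copy_(torch.tensor([-4.0, -5.0, -7.0, -8.0]))
    # relu(v-4) - relu(v-5) - relu(v-7) + relu(v-8) equals 1 iff v in {5, 6, 7}.
    conv2 = nn.Conv2d(4, 1, kernel_size=1, bias=False)
    with torch.no_grad():
        conv2.weight.copy_(torch.tensor([1.0, -1.0, -1.0, 1.0]).view(1, 4, 1, 1))
    return nn.Sequential(conv1, nn.ReLU(), conv2)

# Works on any H x W grid because every layer is stride-1 and fully convolutional.
net = exact_life_cnn()
grid = (torch.rand(1, 1, 16, 16) < 0.3).float()
next_grid = net(grid)
```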
wrs 6 days ago
I was hoping for an explanation of, or some insight from, the loss curve. Training makes very little progress for a long time, then suddenly converges. In my (brief) experience with NN training, I typically see more rapid progress at the beginning, then a plateau of diminishing returns, not an S-curve like this.
Comment #44016359 not loaded
bonzini 6 days ago
Do I understand correctly that it's brute-forcing a small grid rather than learning the algorithm?
Comment #44013577 not loaded
eapriv 6 days ago
Great, we can spend a crazy amount of computational resources and hand-holding in order to (maybe) reproduce three lines of code.
Comment #44013768 not loaded
Comment #44013771 not loaded
Comment #44013760 not loaded
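For scale, the "three lines of code" being alluded to can look roughly like this compact numpy version (toroidal boundary; purely illustrative, not the paper's reference implementation):

```python
# A compact Game of Life step using numpy roll for the neighbour count.
import numpy as np

def step(g):
    n = sum(np.roll(np.roll(g, i, 0), j, 1) for i in (-1, 0, 1) for j in (-1, 0, 1)) - g
    return ((n == 3) | ((n == 2) & (g == 1))).astype(int)
```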
amelius 6 days ago
But can it condense it into a small program?
xchip 6 days ago
Even a simple regression will do that
Comment #44013988 not loaded