Grand-Master Level Chess Without Search: Modeling Choices and Their Implications

102 points by georgehill, over 1 year ago

13 comments

janalsncm, over 1 year ago

I agree with the author’s critique of maximalist interpretations of the paper.

I’ve built a chess engine in the past, so I was less hung up on the “without search” component than much of the original thread seemed to be. I don’t care whether taking an argmax over all possible moves in a position technically qualifies as a search, or whether the model may be doing some implicit search within its weights. What matters to me is that it requires several orders of magnitude less search than Stockfish, and those operations cost much less because they’re executed in parallel on a TPU (a GPU also works).

Optimizing static evaluation for the hardware we have is still a super interesting topic. It’s arguably why deep learning took off: matrix multiplication happens to be really cheap on GPUs.
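To make the "argmax over all possible moves" idea concrete, here is a minimal sketch of move selection with no tree search beyond a single ply. It is not the paper's implementation: `pick_move`, `legal_moves`, `apply_move`, and `evaluate` are hypothetical names, and a toy number game stands in for chess so the block runs without any chess library.

```python
from typing import Callable, Iterable, TypeVar

State = TypeVar("State")
Move = TypeVar("Move")


def pick_move(
    state: State,
    legal_moves: Callable[[State], Iterable[Move]],
    apply_move: Callable[[State, Move], State],
    evaluate: Callable[[State], float],
) -> Move:
    """Return the legal move whose resulting position scores highest.

    `evaluate` is assumed to score a position from the point of view of
    the player who just moved. Only one ply is examined: every successor
    is scored once and the best is taken, with no deeper lookahead.
    """
    moves = list(legal_moves(state))
    if not moves:
        raise ValueError("no legal moves in this position")
    return max(moves, key=lambda move: evaluate(apply_move(state, move)))


# Toy stand-in for chess (an assumption for illustration only):
# the state is an integer, a move adds 1, 2, or 3, and positions
# closer to 10 are considered better.
def toy_legal_moves(state: int) -> list[int]:
    return [1, 2, 3]


def toy_apply_move(state: int, move: int) -> int:
    return state + move


def toy_evaluate(state: int) -> float:
    return -abs(10 - state)


best = pick_move(8, toy_legal_moves, toy_apply_move, toy_evaluate)
```

In a real engine the successor evaluations would be batched into one forward pass, which is where the accelerator parallelism mentioned above would come in.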

tromp, over 1 year ago

This looks like a very solid critique of DeepMind's recent paper, or of how it misled people into regarding its results as much stronger than they really are. However, I think the author goes a bit astray at

> The paper mentions that the model could not follow the "don't repeat the same board position three times in a row" rule (aka "threefold repetition").

That is not the rule. It is not illegal to repeat a position for a third time. The rule only says that a player can claim a draw in the case of a threefold repetition [1].

[1] https://en.wikipedia.org/wiki/Threefold_repetition
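A small sketch of the rule as described here: repetition only makes a draw claimable, and the repeated positions do not have to occur consecutively. The function name and the string position keys are hypothetical; a real key would need to encode piece placement, side to move, castling rights, and en passant possibilities, and this simplified check only looks at the position currently on the board.

```python
from collections import Counter
from typing import Hashable, Sequence


def can_claim_threefold(position_keys: Sequence[Hashable]) -> bool:
    """Return True if the latest position has occurred at least three times.

    `position_keys` is the game history, one key per position reached.
    A True result means a draw may be claimed; it does not mean the move
    that produced the position was illegal.
    """
    if not position_keys:
        return False
    counts = Counter(position_keys)
    return counts[position_keys[-1]] >= 3


# Position "A" occurs three times, but never twice in a row.
history = ["A", "B", "C", "A", "D", "A"]
claimable = can_claim_threefold(history)
```

Because the check depends on the whole game history, a model that only sees the current position has no way to evaluate it, which fits the limitation the quoted sentence is pointing at.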

skybrian, over 1 year ago

This seems to be the most important point:

> it looked at a game position, then used a strong chess-engine to expand a tree of possible game continuations from this position spanning hundreds of thousands if not millions of moves into the future. The game engine then used its internal knowledge of chess to assess the percentage of winning board positions within the possible continuations, which is the "probability of win" for a position. Then, the learner tried to learn this number. This is not learning from observations, it is learning from a super-human expert.

Apparently it was able to summarize the knowledge of that expert pretty well, though.
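As a rough illustration of what "learning this number" can look like, the sketch below turns an engine score into a supervised target. It assumes the engine reports a centipawn score and uses the logistic mapping that Lichess documents for its win-percentage metric; whether the paper's pipeline matches this exactly is not something this thread establishes. The function names and the bin count of 128 are assumptions for illustration.

```python
import math

# Constant from the logistic win-percentage formula documented by Lichess.
LICHESS_K = 0.00368208


def win_probability(centipawns: float) -> float:
    """Map an engine score in centipawns to a win probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-LICHESS_K * centipawns))


def to_bin(probability: float, num_bins: int = 128) -> int:
    """Discretize a probability into one of `num_bins` equal-width classes.

    The bin count is a hypothetical default; binning turns the regression
    target into a classification label a network can be trained on.
    """
    return min(int(probability * num_bins), num_bins - 1)


# An equal position maps to a 50% win probability.
even = win_probability(0)
label = to_bin(even)
```

The learner never sees the engine's search tree, only labels like `label`, which is why the quoted passage calls this learning from an expert rather than from observations.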

qnleigh, over 1 year ago

This article seems completely off-base to me. The fact that the training method can't learn 'no 3 repeated moves' doesn't change my opinion of the result at all; to me this is just a technical detail, and it does not detract from the essence of the result (that you can get good performance without hard-coding search). But the author harps on this point over and over as if it's a fatal flaw.

And then we get some rant about how 'it's not really learning how to plan' because it doesn't know to win in as few moves as possible. But again, this just doesn't matter! The model was trained to maximize its odds of winning. You don't get to make up a new rule that you're supposed to win quickly. Totally beside the point. But somehow this observation is taken to mean that 'well it's not actually planning/reasoning/understanding. It's just imitating its training data.' Such statements are usually operationally meaningless, and the argument could just as well apply to human learning.

Point taken that the training method can't learn arbitrary rules (this is a good point, and it sounds like it needed to be made, though I didn't see the Twitter-storm myself). But the tone of this article irked me a bit.

usgroup, over 1 year ago

Grandmasters do search! They think many moves ahead for most moves, and obviously Stockfish does search -- a lot of search, much more than a grandmaster.

I feel that the sort of structures we implicitly operate over during search can be usefully "flattened" in a higher-dimensional space such that whole paths become individual steps. I feel that, implicitly, that's the sort of thing the network must be doing.

If you watch videos of grandmasters talking about chess positions, they'll often talk about the "structure" of the position, and then confirm the structure via some amount of search. So, implicitly, grandmasters probably do some kind of flattening too, which connects features of the present state to future outcomes in a way that requires less mental effort.

thomasahle, over 1 year ago

People misunderstand this paper. The point is not to create a strong chess engine. We already have those. It's also not to train it in a smart way. It's just supervised learning.

The whole paper is just an argument against people who still think of transformers as "stochastic parrots".

It's probably even a mainly internal DeepMind debate, since people outside accepted long ago that transformers obviously learn algorithms and do something akin to planning.

You can see this undertone in many places in the paper, like when they emphasize that the system plays strongly without _explicit_ search. Or if you read the conclusion.

dang, over 1 year ago

Recent and related:

Grandmaster-Level Chess Without Search - https://news.ycombinator.com/item?id=39301944 - Feb 2024 (128 comments)

mtlmtlmtlmtl, over 1 year ago

This is an excellent summary of some of the myriad problems with the paper, especially the highly dubious claims of grandmaster-level play. It also addresses a lot of misconceptions about things the paper didn't do at all (and, afaik, didn't claim to).

It's a shame it's not getting as much traction on here as the paper itself. Once again, DeepMind's shameless embellishments of their research will become HN myths, because most people will only have seen the paper and not this.

This is of course a general trend: critiques and even retractions rarely get as much attention as the hyperbolised form of the original research, and while the field itself moves on, laypeople come out of it less informed than they were before.

GaggiX, over 1 year ago

If anyone wants to tackle it, the challenge of creating a superhuman model that doesn't use search during inference is still open. I think I have some ideas, not that I could do it without a lot of computing.

somenameforme, over 1 year ago

What is the bot's username? I don't see it referenced in the paper, and in searching the Lichess database using the games offered in the paper, nothing shows up.

andrewljohnson, over 1 year ago

I wonder how well you can train a similar model on a finite number of outputs from a poker solver to output answers for any poker spot in general.

cushpush, over 1 year ago

I find it fascinating how some commenters find this insightful while others are almost offended by this contribution.

matteoraso, over 1 year ago

The big problem with that paper is that grandmaster-level chess is actually really bad for a chess bot. Its Elo is a full 400 points lower than Stockfish's [0], which is the same distance as between a grandmaster and a very strong untitled player.

Oddly enough, I wrote a blog post a few months ago that's tangentially related to this topic [1]. The gist of it is that the search tree is the most important part of a chess engine, since a perfect solution for chess would just be a brute-force search of the entire game tree. Trying to get around that by building a better evaluation function is just a waste of time and money.

[0] https://arxiv.org/pdf/2402.04494.pdf

[1] https://matteoraso.github.io/board-game-engines-are-about-trees-not-evaluation-functions.html#board-game-engines-are-about-trees-not-evaluation-functions
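For a sense of what a 400-point gap means in practice, the standard Elo expected-score formula can be worked out directly. This is a small sketch; the function name is hypothetical, and the formula is the usual logistic Elo model rather than anything specific to the paper.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) for the stronger side.

    `rating_diff` is the stronger player's rating minus the weaker
    player's rating, under the standard Elo model with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


# A 400-point gap gives 1 / (1 + 10**-1) = 10/11, roughly 0.91.
gap_400 = expected_score(400)
```

Under this model the higher-rated side would be expected to score about 91% against an opponent rated 400 points lower.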
