
The neural network of the Stockfish chess engine

355 points, by c1ccccc1, over 4 years ago

11 comments

brilee, over 4 years ago
Ironically, a lot of the tricks Stockfish is using here are reminiscent of tricks that were used in the original AlphaGo and later discarded in AlphaGoZero.

In particular, the AlphaGo paper mentioned four neural networks of significance:

- a policy network trained on human pro games
- an RL-enhanced policy network improving on the original SL-trained policy network
- a value network trained on games generated by the RL-enhanced policy network
- a cheap policy network trained on human pro games, used only for rapid rollout simulations

The cheap rollout policy network was discarded because DeepMind found that "slow evaluations of the right positions" were better than "rapid evaluations of questionable positions". The independently trained value network was discarded because co-training a value and policy head on a shared trunk saved a significant amount of compute, and helped regularize both objectives against each other. The RL-enhanced policy network was discarded in favor of training the policy network to directly replicate MCTS search statistics.

The depth and branching factor in chess and Go are different, so I won't say the solutions ought to be the same, but it's interesting nonetheless to see the original AlphaGo ideas be resurrected in this form.

The incremental updates are also related to Zobrist hashing, which the Stockfish authors are certainly aware of.
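The Zobrist-hashing connection mentioned above is the same "update only what changed" idea. A minimal sketch (not any engine's actual code; the table layout and function names here are illustrative):

```python
import random

random.seed(0)

# Zobrist hashing: one random 64-bit key per (piece, square) pair.
PIECES = range(12)   # 6 piece types x 2 colors
SQUARES = range(64)
ZOBRIST = [[random.getrandbits(64) for _ in SQUARES] for _ in PIECES]

def full_hash(board):
    """board: dict mapping square -> piece index. XOR all occupied squares."""
    h = 0
    for sq, piece in board.items():
        h ^= ZOBRIST[piece][sq]
    return h

def update_hash(h, piece, from_sq, to_sq):
    """Incremental update for a quiet move: XOR the piece out of its old
    square and into the new one, instead of rehashing the whole board."""
    h ^= ZOBRIST[piece][from_sq]   # remove piece from origin square
    h ^= ZOBRIST[piece][to_sq]     # place piece on destination square
    return h
```

Because XOR is its own inverse, the incremental update is exact, which is the same property the NNUE accumulator gets from integer arithmetic.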
glinscott, over 4 years ago
If anyone wants to experiment with training these nets, it's a great way to get exposed to a nice mix of chess and machine learning.

There are two trainers currently: the original one, which runs on CPU (https://github.com/nodchip/Stockfish), and a pytorch one which runs on GPU (https://github.com/glinscott/nnue-pytorch).

The SF Discord is where all of the discussion/development is happening: https://discord.gg/KGfhSJd

Right now there is a lot of experimentation to try adjusting the network architecture. The current leading approach is a much larger net which takes in attack information per square (e.g. is this piece attacked by more pieces than it's defended by?). That network is a little slower, but the additional information seems to be enough to be stronger than the current architecture.

Btw, the original Shogi developers really did something amazing. The nodchip trainer is all custom code, and trains extremely strong nets. There are all sorts of subtle tricks embedded in there as well that led to stronger nets. Not to mention, getting the quantization (float32 -> int16/int8) working gracefully is a huge challenge.
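To illustrate what the float32 -> int8 step involves, here is a minimal symmetric-quantization sketch. This is an assumption-laden toy, not Stockfish's actual scheme; the scale factor and function names are made up for illustration:

```python
import numpy as np

def quantize_weights(w_float, scale):
    """Scale float32 weights, round to the nearest integer, and clamp
    into the int8 range. The clamping is where accuracy can silently be
    lost, which is part of why doing this 'gracefully' is hard."""
    q = np.round(w_float * scale)
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(w_int8, scale):
    """Map int8 weights back to approximate float values."""
    return w_int8.astype(np.float32) / scale
```

Values whose scaled magnitude exceeds 127 saturate, so choosing the scale is a trade-off between range and precision.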
knuthsat, over 4 years ago
This is very nice. Reminds me a lot of tricks used for simple linear models. And it seems to work, given that Leela Chess Zero is losing with quite a gap.

Most of the time one would learn a model by changing just a single feature, and then redoing the whole sum made no sense.

A good example is learning a sequence of decisions, where each decision might have a cost associated with it. You can then say that the current decision depends on a previous one, and vary the previous one to learn to recover from errors (if the previous decision was bad, then you'd still like to make the best decision for the current state).

So even if your training data does not have this error-recovery example, you can just iterate through all previous decisions and make the model learn to recover.

An optimization in that case would be to not redo the whole sum (for computing the decision function of a linear model).
dan-robertson, over 4 years ago
It seems that the strategy is to use the neural network to score various moves, and then a search strategy to try to find moves that result in a favourable score. This post focuses on some of the technical engineering details of designing such a scoring network. In particular the scoring is split into two parts: 1. a matrix multiplication by a sparse input vector to get a dense representation of a position, and 2. some nonlinear and further layers after this first step. Step 1 is considered the more expensive one.

The way this is made cheap is by making it incremental: given some board s and the output of the first layer, b + Ws, it is cheap to compute b + Wt, where t is a board that is similar to s (the difference is W(t - s), and the vector t - s is 0 in almost every element).

This motivates some of the engineering choices, like using integers instead of floats. If you used floats then this incremental update wouldn't work.

It seems to me that a lot of the smarts of Stockfish will be in the search algorithm getting good results, but I don't know if that just requires a bit of parallelism (surely some kind of work-stealing scheduler) and brute force, or if it mostly relies on some more clever strategies. And maybe I'm wrong and the key is really in the scoring of positions.
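The incremental b + Ws -> b + Wt update described above can be sketched as follows. This is a toy with dense numpy arrays and illustrative names, not the engine's actual (SIMD integer) implementation:

```python
import numpy as np

def full_accumulator(W, b, s):
    """Compute the first-layer output b + W s from scratch.
    s is a sparse 0/1 feature vector encoding the position."""
    return b + W @ s

def incremental_accumulator(acc, W, removed, added):
    """Given acc = b + W s, compute b + W t, where t differs from s only
    in the features listed in `removed` (1 -> 0) and `added` (0 -> 1).
    Only a few columns of W are touched per move, and with integer
    arithmetic the result is bit-exact, never drifting from the
    from-scratch computation."""
    for i in removed:
        acc = acc - W[:, i]
    for i in added:
        acc = acc + W[:, i]
    return acc
```

A quiet move typically removes one feature and adds one, so the update cost is two column additions regardless of the layer's total size.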
LittlePeter, over 4 years ago
Leela played Stockfish 200 games and won 106 - 94 [1]. Not sure which version of Stockfish was used.

Some of the Leela-Stockfish games are analyzed by agadmator on YouTube [2].

[1] https://www.chess.com/news/view/13th-computer-chess-championship-leela-chess-zero-stockfish

[2] https://www.youtube.com/watch?v=YtXZjKItuC8
zetazzed, over 4 years ago
The gap between white and black piece performance is massive in these top engines, if I'm reading it right. LCZero won 0 / lost 4 as black, and won 24 / lost zero as white (with lots of draws)? I had no idea the split was so big now. Do human tournaments look like this too these days? (From https://tcec-chess.com/)
pcwelder, over 4 years ago
Using a previous Stockfish scorer, they trained an NN without any labelling effort. This is also similar to how unsupervised translation is done in some methods: they start from word->word dictionary results and iteratively train lang1->lang2 and lang2->lang1 models feeding on each other's output.
spiantino, over 4 years ago
Great writeup!

One thing I don't understand is why it wouldn't be smarter to augment the inputs with some of the missing board information, particularly the availability of castling. Even though this network is a quick-and-dirty evaluation, it seems like there's room for a few additional bits in the input, and being able to castle very much changes the evaluation of some common positions.
Fragoel2, over 4 years ago
I don't know a lot about chess, but I have one question: isn't chess a solved game (in the sense that, given a board state, we can always compute the right move)? Why use a neural network that can introduce, a very small percentage of the time, mistakes? I guess it is for performance reasons?
nl, over 4 years ago
This is a really good article.

Lots of pieces about neural network design skip over the design of the representation in the input stage, which is one of the key design issues when building a custom neural network.

I love how much depth this article goes into about that representation.
billiam, over 4 years ago
Analyses like this are a great indication that the main effect of refining neural networks around chess will be to make neural networks more exciting and ultimately make chess more boring.