This seems to be the most important point:<p>> it looked at a game position, then used a strong chess-engine to expand a tree of possible game continuations from this position spanning hundreds of thousands if not millions of moves into the future. The game engine then used its internal knowledge of chess to assess the percentage of winning board positions within the possible continuations, which is the "probability of win" for a position. Then, the learner tried to learn this number. This is not learning from observations, it is learning from a super-human expert.<p>Apparently it was able to summarize the knowledge of that expert pretty well, though.
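To make the quoted setup concrete, here is a minimal sketch of "learning from an expert" on a toy game instead of chess. Everything in it is an assumption for illustration and not the method from the work being discussed: the subtraction game, the names `count_lines`, `expert_win_probability` and `train_learner`, and the tabular learner. The "expert" expands every continuation and reports the share of lines won by the side to move, as the quote describes. The learner never sees a game outcome; it only regresses onto the expert's number.

```python
import math
from functools import lru_cache

MOVES = (1, 2, 3)  # hypothetical toy game: take 1-3 stones; taking the last stone wins


@lru_cache(maxsize=None)
def count_lines(stones):
    """Return (lines won by the side to move, total lines) over all continuations."""
    if stones == 0:
        # The previous player took the last stone, so the side to move has lost.
        return (0, 1)
    wins = total = 0
    for m in MOVES:
        if m <= stones:
            child_wins, child_total = count_lines(stones - m)
            # Every line the opponent loses from the child position is a line we win.
            wins += child_total - child_wins
            total += child_total
    return (wins, total)


def expert_win_probability(stones):
    """The 'expert' label: share of winning lines (not a minimax value)."""
    wins, total = count_lines(stones)
    return wins / total


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_learner(max_stones, steps=2000, lr=1.0):
    """Fit one logit per position to the expert's numbers (assumed tabular learner)."""
    targets = {s: expert_win_probability(s) for s in range(1, max_stones + 1)}
    logits = {s: 0.0 for s in targets}
    for _ in range(steps):
        for s, target in targets.items():
            # Gradient step on the cross-entropy between learner output and expert label.
            logits[s] += lr * (target - sigmoid(logits[s]))
    return logits, targets


if __name__ == "__main__":
    logits, targets = train_learner(max_stones=12)
    for s in sorted(targets):
        print(s, round(targets[s], 3), round(sigmoid(logits[s]), 3))
```

With one weight per position this learner only memorises the expert's outputs; a real system would use a model that generalises across positions. The point is just that the training signal is the expert's evaluation, not observed games.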