Is AlphaZero really a breakthrough in AI?

114 points by borisjabes over 7 years ago

17 comments

rstuart4133 over 7 years ago

The article glosses over why the 4 hours was possible.

Firstly, a major challenge in training an AI of this sort is getting enough labelled data. They played 300,000 games from memory. Under normal circumstances, that requires access to 300,000 games played by experts so the AI can learn to copy what the expert does. That is how Alpha Go did it. AlphaZero neatly sidesteps this by generating its own training data by playing itself. If how to do this was "obvious", it would have been done a long time ago.

Secondly, they parallelised things. Alpha Go trained the AI from the results of each game as it was played. AlphaZero played 1,250 games simultaneously, and fed the results into the AI as they became available. The result is it took well over an order of magnitude less elapsed time to train AlphaZero than Alpha Go, even though the CPU cycles may have been roughly similar.

Finally, he overstates how hard it is to customise the engine (Markov algorithm + AI) to a game. There are two pointers to this. Firstly, it took them over 2 years to create Alpha Go. It became the world champion on 23 May. Now, 7 months later, we have AlphaZero. But AlphaZero didn't learn to play just one game in those 7 months: it is the best player on the planet for 3 games: Go, Chess, and Shogi.

I don't know whether they customised the AI for each game, but I suspect if the input and output layers were wide enough to accommodate the largest game they could use the same one for each. The Markov engine does have to know how to make all legal moves from any game position, but coding that isn't rocket science or particularly time consuming. The AI does _not_ start out knowing those rules - it learns them from the Markov engine. It's all very DRY.

This sort of engine only works for a particular style of game - one where there is only a smallish set of well known moves at each step, and the playing board is also smallish (19x19 in the case of Go, with three possible states for each position: empty, black, white). Most board and card games fit this description. AlphaZero can teach itself to play any of them to a standard higher than any human can play them, and do it within a few hours, not the decades it takes to create a human grand master. The net result is that Homo sapiens' reign of supremacy at playing this style of game is now over. For this entire niche our brains have been firmly relegated to a 2nd class intelligence.

I don't know whether you would call pulling this off a breakthrough, but I do know the techniques they applied will be copied by man+dog for years if not decades to come.
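
To make the self-play idea in this comment concrete, here is a minimal sketch of how a program can generate its own labelled data from nothing but a rules engine. Everything in it is an assumption for illustration: tic-tac-toe stands in for Go, Chess and Shogi, moves are chosen at random where AlphaZero would use its network-guided search (the published system uses Monte Carlo tree search, which is presumably what the comment calls the "Markov engine"), and the function names are hypothetical. It runs games one after another; since each game is independent, they could in principle be run in parallel as the comment describes.

```python
import random

# Hypothetical toy setup: tic-tac-toe stands in for the real games.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def legal_moves(board):
    """Rules engine: the only game-specific knowledge written by hand."""
    return [i for i, cell in enumerate(board) if cell == 0]


def winner(board):
    """Return +1 or -1 if that player has three in a row, otherwise 0."""
    for a, b, c in WIN_LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0


def self_play_game(rng):
    """Play one game against itself and return labelled training samples."""
    board, player, history = [0] * 9, 1, []
    while winner(board) == 0 and legal_moves(board):
        history.append((tuple(board), player))
        # Placeholder policy: a real system would pick the move with a
        # search guided by the current network, not uniformly at random.
        move = rng.choice(legal_moves(board))
        board[move] = player
        player = -player
    result = winner(board)
    # Label every visited position with the final outcome, seen from the
    # point of view of the player who was about to move.
    return [(pos, who, result * who) for pos, who in history]


def generate_training_data(num_games, seed=0):
    """Collect samples from many independent self-play games."""
    rng = random.Random(seed)
    data = []
    for _ in range(num_games):
        data.extend(self_play_game(rng))
    return data


samples = generate_training_data(1000)
```

Each sample is a `(position, player_to_move, outcome)` triple. The only feedback is the final result of the game, which is copied back onto every position that led to it; no expert games are needed.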

super_mario over 7 years ago

I would like to see a rematch with Stockfish configured correctly. I give Stockfish at least 1 GB of hash per thread. In the AlphaZero match they had 64 threads and only 1 GB of hash for all of them.

No one knows how Stockfish behaves with that many search threads, since no one tested it. I don't know if there is any data on how Stockfish scales with the number of CPUs, but I seem to remember that being one of the weaknesses of the engine, and that commercial engines like Komodo scaled better with a larger number of CPUs.

Anecdotally, on a 2.8 GHz 8-core 64-bit CPU with 8 threads and 16 GB hash size it calculates about 7-8 million positions per second at the beginning of the game, and much more later when there are fewer pieces on the board. The setup in the AlphaZero match, with 64 threads on 32 physical CPU cores, calculated about 70 million positions per second, with a tiny 1 GB hash (i.e. the engine could remember less of what it calculated previously).

But my Stockfish on a computationally weaker setup clearly flags some of the moves in the match as mistakes by the 64-threaded Stockfish. I would really like to understand why. Is it because if you have more resources to see deeper you see how hopeless the situation is, or is it something else?

I am sure we will see more of these matches. It would be nice if Google volunteered some computing resources and entered TCEC regularly.
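
For a sense of scale, the figures quoted in this comment can be reduced to per-thread numbers. The sketch below is plain arithmetic on those figures, with two assumptions of mine: 1 GB is taken as 1024 MB, and 7.5 million is used as the midpoint of the "7-8 million" range. The helper name is hypothetical.

```python
def per_thread(total, threads):
    """Split a shared resource or throughput figure evenly across threads."""
    return total / threads


# Hash table size per search thread, in MB (assuming 1 GB = 1024 MB).
home_hash_mb = per_thread(16 * 1024, 8)    # commenter's 8-thread setup
match_hash_mb = per_thread(1 * 1024, 64)   # 64-thread match setup

# Search speed per thread, in positions per second.
home_speed = per_thread(7.5e6, 8)          # assumed midpoint of 7-8 million
match_speed = per_thread(70e6, 64)

print(f"hash per thread: {home_hash_mb:.0f} MB vs {match_hash_mb:.0f} MB")
print(f"speed per thread: {home_speed:,.0f} vs {match_speed:,.0f} positions/s")
print(f"hash ratio: {home_hash_mb / match_hash_mb:.0f}x")
```

On these numbers each thread searches at a similar speed in both setups, but the match configuration gives each thread 16 MB of hash against 2048 MB at home, a factor of 128.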

seanwilson over 7 years ago

Can anyone summarise how self play works here if AlphaZero only starts out being told the rules of the game? Does it initially play games using completely random moves as both players? Is it only told who the winner is with no other feedback? How is it able to learn e.g. that certain moves at the start eventually lead to a win?

ivanhoe over 7 years ago

Not sure how big a breakthrough in the AI domain it is, but after watching some of the AlphaZero games against Stockfish (available on YouTube) I'm convinced it's a true revolution in how computers play chess. It plays so much more like humans do, without the usual crazy micromanagement that other chess engines love to do. It's not boring to watch, and it makes very few weird moves that only a computer would ever make. If I didn't know what it is, I would presume that it's a real player (and an extremely good one).

seanwilson over 7 years ago

Given their track record, I don't think it's likely the DeepMind team are trying to be sneaky here. They'd be found out eventually given how big their claim is.

When working in academia, I found it very common for research papers to not come with source code or enough information to allow you to replicate experiments yourself. You usually have to pester the author. I don't find the (valid) criticisms here that unusual. I'm not sure why they wouldn't release the moves for all the test games played, though, seeing as that should be simple to do.

As for Stockfish and AlphaZero running on different hardware... AlphaZero's approach is built around taking full advantage of what TPUs can do quickly and Stockfish doesn't utilise TPUs, so how are you meant to make this fair? Does Stockfish eventually level out when you throw enough hardware at it? Doesn't DeepMind's claim about AlphaZero evaluating significantly fewer moves per turn invalidate the criticism about the hardware used?

zellyn over 7 years ago

It seems weird for someone with experience in both chess and, especially, AI to write, “This improvement on computing power paves the way for the development of newer algorithms, and probably in a few years a game like chess could be almost solved by heavily relying on brute force.”

It's like they don't understand the exponential nature of depth search in chess…
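
The point about exponential growth can be put in numbers. The sketch below assumes an average branching factor of about 35 legal moves per chess position, a commonly quoted rough figure, and a full-width search with no pruning; real engines prune heavily, so their effective branching factor is much lower, but the growth is still exponential. The function names are hypothetical.

```python
import math

BRANCHING_FACTOR = 35  # assumed average number of legal moves per position


def nodes_for_depth(depth, b=BRANCHING_FACTOR):
    """Leaf positions in a full-width tree searched `depth` plies deep."""
    return b ** depth


def extra_depth_from_speedup(speedup, b=BRANCHING_FACTOR):
    """Extra plies reachable in the same time on a `speedup`-times faster machine."""
    return math.log(speedup) / math.log(b)


for depth in (2, 4, 6, 8):
    print(f"depth {depth}: {nodes_for_depth(depth):,} positions")

print(f"1000x faster hardware: +{extra_depth_from_speedup(1000):.2f} plies")
```

Under these assumptions, hardware a thousand times faster buys slightly less than two extra plies of brute-force depth.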

zaf over 7 years ago

"However, the experimental setting does not seem fair. The version of Stockfish used was not the last one but, more importantly, it was run in its released version run on a normal PC, while AlphaZero was ran using considerable higher processing power. For example, in the TCEC competition engines play against each other using the same processor."

That does sound fishy.

mannykannot over 7 years ago

The author is actually claiming something more serious than the title suggests: "...all the concerns added together cast reasonable doubts about the current scientific validity of the main claims."

To me, what follows does not seem to justify this claim, but it is not my field. In addition, some of his arguments seem to be beside the point - for example, he asks "Does AlphaZero completely learn from self-play?", and while saying generally, yes, he objects that encoding the rules was a non-trivial matter. While that certainly seems to be true, it does not seem to have much bearing on the claim that AlphaZero apparently learned to win through self-play. That, to me, seems to be its singular achievement: the absence of human-written tactics and strategy (unless the encoding of the rules somehow prefigured them, which is not being claimed here, and which seems highly unlikely.)

simonh over 7 years ago

When can we have an AI that plays Third Reich? But not too well, I want to at least have a chance.

I'm actually not joking. I wonder how much different it would be to teach an AI like this how to play more complex games. I imagine Axis and Allies wouldn't take much, but Third Reich is notoriously complicated. The quickest war-length game I've played took a week of playing 3-4 hours per day. Games like that seem to me to be much more similar to real world problems, with multiple different sorts of trade-offs that interlock with each other.

Are neural AIs like this actually feasible to train for problems like that, or are other AI techniques better suited to it? What about games with multiple different game systems, like board games with a card game element to them, such as Settlers of Catan? Would you need to use several different types of AI to optimize different parts of the game?

proc0 over 7 years ago

Engineering breakthrough?

jraut over 7 years ago

An AI which excels in imperfect information games (card games, Starcraft) would be a real breakthrough. Raw calculation power is bound to win games with a finite set of possibilities. The huge leap would be the ability to handle probabilities: taking guesses, making assumptions and coming to some kind of successful conclusion based on those.

erikb over 7 years ago

I'm no AI expert, but I won't start to worry about AI being used generally (the last point of the article) until it beats a really complex game like the real-time strategy title StarCraft.

Even if we all agree that AlphaGo is the Deep Blue of Go, we still have a few more layers to go before humans need to worry.

banachtarski over 7 years ago

Whether it was a breakthrough or not, I have to say, the moves it played were certainly "creative" in a profound sense.

zerostar07 over 7 years ago

probably a (big) incremental step over a previous breakthrough

a_imho over 7 years ago

Considering how much AI is hyped, it is getting harder and harder not to be a skeptic.

vadimberman over 7 years ago

From what I recall, DeepMind never mastered Pacman.

gcatalfamo over 7 years ago

No, it is an optimization of something already existing. An innovation, but not a breakthrough per se.

Edit: this is an oversimplification
