Note: it doesn't learn from pixels but reads salient features directly from RAM, and it has superhuman reaction time; performance degrades badly when human-like delays are added.<p>Good discussions on Reddit: <a href="https://www.reddit.com/r/MachineLearning/comments/5vh4ae/r_a_new_foe_has_appeared_170206230_beating_the/" rel="nofollow">https://www.reddit.com/r/MachineLearning/comments/5vh4ae/r_a...</a> <a href="https://www.reddit.com/r/smashbros/comments/5vin8x/beating_the_worlds_best_at_super_smash_bros_melee/" rel="nofollow">https://www.reddit.com/r/smashbros/comments/5vin8x/beating_t...</a>
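A minimal sketch of how a "human-like delay" experiment might work: feed the agent stale observations by buffering frames, so its reaction time matches a human's (~200 ms, i.e. about 12 frames at 60 fps). The `ReactionDelayWrapper` class and the `agent.act(obs)` interface are assumptions for illustration, not the paper's actual code.

```python
from collections import deque

class ReactionDelayWrapper:
    """Simulate human reaction time by showing the agent stale observations.

    Hypothetical sketch: `agent` is any object with an `act(obs)` method;
    delay_frames=12 is roughly 200 ms at 60 fps.
    """
    def __init__(self, agent, delay_frames=12):
        self.agent = agent
        # Keep the last delay_frames+1 observations; the oldest is what
        # the agent actually "sees" on the current frame.
        self.buffer = deque(maxlen=delay_frames + 1)

    def act(self, obs):
        self.buffer.append(obs)
        # Until the buffer fills, the agent sees the oldest frame available.
        return self.agent.act(self.buffer[0])
```

With the wrapper in place, the agent's action on frame t is computed from the observation at frame t - delay_frames, which is the degradation scenario the note above describes.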
Video of the AI here, playing as the black captain falcon: <a href="https://www.youtube.com/watch?v=dXJUlqBsZtE" rel="nofollow">https://www.youtube.com/watch?v=dXJUlqBsZtE</a>
This appears to be the first reinforcement learning AI capable of beating Super Smash Bros. pro players. Here it is beating Mew2King:<p><a href="https://www.youtube.com/watch?v=z-1YfhUFtbY&feature=youtu.be&t=285" rel="nofollow">https://www.youtube.com/watch?v=z-1YfhUFtbY&feature=youtu.be...</a>
While the AI might be cheating by taking salient features from RAM rather than from pixel values, this is still an incredible feat. Just a few years ago we had no general-purpose algorithms that could take even hand-picked salient features and self-learn policies to near this level, this quickly.
What's the key insight here compared to previous systems? As far as I can tell, no one can yet beat simple non-deterministic games that require some planning.<p>My favorite example is Ms. Pac-Man because it seems so old and simplistic. It's been tried by a dozen teams, and no one can beat a decent human.