This reminds me of AI research using NES games. The AI eventually became proficient at completing Mario levels, and along the way it discovered novel strategies for survival, obtaining points, and finishing levels.

Check out this timestamp to watch the machine "cheat": https://youtu.be/xOCurBYI_gY?t=9m55s

The researcher's site about the project: http://www.cs.cmu.edu/~tom7/mario/

The paper, "The First Level of Super Mario Bros. is Easy with Lexicographic Orderings and Time Travel... after that it gets a little tricky.": http://www.cs.cmu.edu/~tom7/mario/mario.pdf
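For anyone who hasn't read it, the trick in that paper is worth sketching: learnfun derives a lexicographic ordering over NES RAM bytes that tends to increase during human play (world, level, score, player position), and playfun then searches, using emulator save states as "time travel", for inputs that make memory "go up" under that ordering. Here is a minimal Python sketch of just the comparison step; the byte offsets are illustrative placeholders, not the addresses the paper actually learns:

```python
# Sketch of the lexicographic objective in the spirit of learnfun/playfun.
# The offsets below are made-up placeholders for illustration, not the real
# NES RAM addresses, which the paper derives automatically from human play.

from typing import Sequence

# Ordered list of RAM offsets, most significant first
# (e.g. world, level, a score byte, player x-position).
LEXICOGRAPHIC_ORDER = [0x075F, 0x0760, 0x07DE, 0x0086]

def progress_key(ram: Sequence[int]) -> tuple:
    """Project a RAM snapshot onto the learned ordering."""
    return tuple(ram[offset] for offset in LEXICOGRAPHIC_ORDER)

def is_progress(before: Sequence[int], after: Sequence[int]) -> bool:
    """True if the new state is 'greater' under the lexicographic ordering,
    i.e. the chosen inputs made the game state go up."""
    return progress_key(after) > progress_key(before)

# Example: an all-zero snapshot vs. one where the "score" byte ticked up.
before = [0] * 0x800
after = list(before)
after[0x07DE] = 1
assert is_progress(before, after)
```

The search side then just tries input sequences (rewinding via save states) and keeps whichever ones maximize this key, which is exactly why it happily keeps glitches that make the bytes go up.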
> It’s not the most powerful or widely used form of AI at the moment, but it is making something of a comeback. The ability to crack Q*bert could be read as a good omen that evolutionary algorithms are going to be very useful in the future.

Wow, that's quite a jump to make.
The title seems misleading to me. The AI isn't finding bugs by somehow examining the game's source code; it's trying random gameplay and exploiting any advantages that emerge. That it's finding previously unknown bugs seems to come down almost entirely to trying things that human players wouldn't think to do.
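That mechanism is worth spelling out: the search generates candidate input sequences, keeps whichever scores highest, and has no concept of "intended" play, so a glitch that pumps the score is just another good mutation. A rough sketch of that loop; `run_episode` here is a toy stand-in, not a real emulator interface:

```python
import random

# Score-driven search over raw button inputs. In the real setting,
# run_episode would feed the presses to an emulator and read the score
# out of game memory; here it is a trivial placeholder.

ACTIONS = ["LEFT", "RIGHT", "JUMP", "NOOP"]
EPISODE_LEN = 600  # frames

def run_episode(actions):
    """Toy scoring function standing in for an emulator rollout."""
    return sum(1 for a in actions if a == "RIGHT")

def mutate(actions, rate=0.05):
    """Randomly resample a small fraction of the button presses."""
    return [random.choice(ACTIONS) if random.random() < rate else a
            for a in actions]

def search(generations=200):
    best = [random.choice(ACTIONS) for _ in range(EPISODE_LEN)]
    best_score = run_episode(best)
    for _ in range(generations):
        child = mutate(best)
        score = run_episode(child)
        if score > best_score:  # any advantage is kept, glitch or not
            best, best_score = child, score
    return best_score

if __name__ == "__main__":
    print("best score found:", search())
```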
This case is an example of wireheading [1] and illustrates the difficulty of eliciting the behaviors we *actually* desire from complex systems we do not fully understand.

[1] https://wiki.lesswrong.com/wiki/Wireheading

Another lesson: evolutionary algorithms are really hard to control. Using neural networks developed through evolutionary algorithms means we are employing a mostly opaque (though not entirely black) box, created by a mechanism we can't mentally keep track of in detail. I hope they are not deployed to control any critical systems until we have a much better grasp of them.
Well, how do you decide what is and isn't cheating? It works, and it increases the evaluation score.

In this case, one possible workaround for "cheating" would be to reduce the control precision, add some jitter to the control inputs, or change the goal function. But I'd say that if it's done solely with the intended controls, it's not cheating (as opposed to changing memory or using a debug cheat code).

Still, even in real sports some "cheating" is allowed (see the Fosbury Flop).
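One way the jitter idea could be implemented: evaluate each candidate input sequence several times with small random perturbations and score it on its worst trial, so exploits that only work with frame-perfect timing stop paying off. A hedged sketch, with `run_episode` again standing in for an actual game rollout:

```python
import random

def jittered_fitness(actions, run_episode, trials=5, drop_prob=0.02):
    """Score an input sequence under input noise.

    Each trial randomly replaces a small fraction of frames with a no-op
    (simulating dropped or jittered inputs), and the worst trial counts.
    Strategies that need frame-perfect timing tend to fail at least once,
    while robust play keeps its score.
    """
    scores = []
    for _ in range(trials):
        noisy = [("NOOP" if random.random() < drop_prob else a)
                 for a in actions]
        scores.append(run_episode(noisy))
    return min(scores)
```

Scoring on the minimum rather than the average is the stricter choice; averaging would still let a lucky frame-perfect run pull the fitness up.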
I've always found this a good project for demonstrating AI: https://xviniette.github.io/FlappyLearning/ (based on neuroevolution). Speed it up for faster results.
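In miniature, the neuroevolution behind that kind of demo looks something like the following: a population of tiny fixed-topology networks, fitness measured by survival time, and the best performers cloned with Gaussian mutation. The "physics" in this sketch are invented purely for illustration and don't match the linked project:

```python
import math
import random

# Minimal neuroevolution sketch: fixed-topology nets (2 inputs -> 4 hidden
# -> 1 output), fitness = survival time in a toy Flappy-like environment,
# elitist selection plus Gaussian mutation. No gradients anywhere.

class Net:
    def __init__(self):
        self.w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(4)]
        self.w2 = [random.uniform(-1, 1) for _ in range(4)]

    def decide(self, dy, dist):
        """Inputs: vertical offset to the gap, distance to the next pipe."""
        hidden = [math.tanh(w[0] * dy + w[1] * dist) for w in self.w1]
        return sum(w * h for w, h in zip(self.w2, hidden)) > 0  # True = flap

    def mutated(self, sigma=0.2):
        child = Net()
        child.w1 = [[w + random.gauss(0, sigma) for w in row] for row in self.w1]
        child.w2 = [w + random.gauss(0, sigma) for w in self.w2]
        return child

def play(net, max_steps=1000):
    """Toy stand-in for the game: survive as long as possible."""
    y, vy, gap_y, gap_x = 0.5, 0.0, 0.5, 1.0
    for step in range(max_steps):
        if net.decide(gap_y - y, gap_x):
            vy = 0.03                     # flap
        vy -= 0.005                       # gravity
        y += vy
        gap_x -= 0.02
        if gap_x < 0:                     # passed a pipe, spawn the next one
            gap_x, gap_y = 1.0, random.uniform(0.2, 0.8)
        if y < 0 or y > 1 or (gap_x < 0.05 and abs(y - gap_y) > 0.15):
            return step                   # crashed
    return max_steps

def evolve(pop_size=50, generations=30):
    population = [Net() for _ in range(pop_size)]
    for gen in range(generations):
        scored = sorted(population, key=play, reverse=True)
        elite = scored[: pop_size // 5]
        population = elite + [random.choice(elite).mutated()
                              for _ in range(pop_size - len(elite))]
        print(f"gen {gen}: best survival = {play(scored[0])}")
    return population[0]

if __name__ == "__main__":
    evolve()
```

Selection plus mutation is the whole algorithm, which is why these demos are so easy to wire up to any game that exposes a score or a survival time.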
Can we put AI to work on proving that we live in a simulation? I would never enter and exit my apartment 38 times, alternating between forwards, backwards, and each side, but an AI would. Maybe then all the walls will start flashing and we'll know!