This uses a particular form of a fundamentally simple yet surprisingly powerful class of learning algorithms called regret minimization. CFR is interesting in an of itself as it specializes regret minization to play extensive form games. There are also CFR algorithms to play multiplayer and no-limit games and though the guarantees of optimality are no longer there, the players are still strong (but for now, far away from experts).<p>The article states that this algorithm is weak to bad players but that's more an artifact of resources and training method; one advantage of minimizing regret on games instead of using linear programming is that online learning versions can adapt to exploit poor play with payoff larger than the game's value.<p>I've also posted here before that RM solves 2 player Zero sum game more efficiently than linear programming and how it's related to boosting, portfolio optimization and as an abstraction of natural selection<i>.<p></i><a href="http://www.pnas.org/content/111/29/10620.full" rel="nofollow">http://www.pnas.org/content/111/29/10620.full</a>
><i>"The algorithm, named CFR+ by its creators, uses an improved version of a technique called counterfactual regret minimization (CFR). Past CFR algorithms have tried to solve poker by using several steps at each decision point: coming up with counterfactual values representing different game outcomes; applying a regret minimization approach to figure out the strategy leading to the best outcome; and averaging the latest strategy with all past strategies."</i><p>Seems like a useful algorithm for dating sites.<p>(only half-joking)
It would appear that this is some of the team at Alberta that "solved" checkers: <a href="http://en.wikipedia.org/wiki/Chinook_(draughts_player)" rel="nofollow">http://en.wikipedia.org/wiki/Chinook_(draughts_player)</a><p>The book <a href="http://www.amazon.com/One-Jump-Ahead-Jonathan-Schaeffer-ebook/dp/B002C1AN3Y/ref=sr_1_3?ie=UTF8&qid=1420764059&sr=8-3&keywords=one+jump+ahead" rel="nofollow">http://www.amazon.com/One-Jump-Ahead-Jonathan-Schaeffer-eboo...</a> One Jump Ahead details the checkers effort.<p>I highly recommend this book.<p>The "solved" aspect of resembles enumerating all possible outcomes.
Abysmal headline. The bot conquered heads-up, limit hold 'em, a game with next to no popularity. Conquering heads-up may be a good first step, though I'd reason that making a bot that can beat a full limit table is exponentially harder than making a bot that can beat heads-up limit.<p>Full table (9 or 10 person) limit hold 'em has been mostly dead in casinos for well over a decade and it only had a shelf life of a few years online, even during the boom.<p>There are better edges (for the sharks) and more excitement (for the fish) in no-limit and pot-limit games. I don't see a bot being close to be able to conquer heads-up no-limit, let alone a full table of no-limit. I suspect most heads-up limit happens in private games or the end of a limit poker tournament. It's played in scenarios that will not really be that exploitable should a person grok this bot's abilities.<p>The article also mentions that it's opponent was another bot that was also playing a very strong strategy. That suggests it's not really able to adapt to individual play, which is essential to being at profitable at all but the lowest levels of poker.
>After all, Cepheus honed its strategy by playing the equivalent of a near-perfect opponent that made practically no mistakes.<p>So why don't you f king use <i>that</i> instead of making your own program?<p>Also, how do they know that play is optimal if poker hadn't been solved before? They're comparing it to perfect play, but if we can compute that, then the problem is done anyway.
It would be nice to have more details on that CFR+ approach. If you just do what the article claims (substitute averages with regret from recent iteration) you will end up with oscillating solutions which never reach the equilibrium (I just tried it with naive CFR implementation). There is surely more to that and it would be nice to see what.<p>As to the claim of "conquering". While there is no reason to not believe them let's see how they fare vs other near optimal AIs. There is a lot of scope for numerical mistakes when you are dealing with solutions that big and it may well be that they missed something along the way. Some other teams claimed they solved HU holdem some time ago (and without supercomputers). They compete in Alberta yearly championship every year so it will be easy to see how it goes for Cepheus.
'Conquers' in the sense it always plays +EV hands. This is exactly what 'grinders' do -- you just play the hands that have a long-term positive expected value. You play that way long enough, your risk goes to zero and you normalize out at a regular return.
Misleading headline.<p>This is a complete -solution- for a limited poker version. Really strong, but stochastic, Hold'em algorithms have existed for a long time.
Best article so far about this: <a href="http://www.parttimepoker.com/heads-up-limit-holdem-has-been-solved" rel="nofollow">http://www.parttimepoker.com/heads-up-limit-holdem-has-been-...</a>
Limited still means having a choice in what to bet. Trivial EV calculations can only tell you to bet or not, so I assume it's using something more involved.
> [...] because of the huge amount of memory required; roughly 262 terabytes of memory. That’s about 268,288 times as much memory as the 1-gigabyte memory available to an iPhone 6.<p>Wow. I can't believe that sentence made it into an article, let alone one on IEEE Spectrum.<p>First, normal people don't know how much RAM is in a phone. Second, the numbers are reasonably identical (only differing by 2.4%) so what's the point of specifying it out that way? Why not figure out what an average amount of RAM in a new computer is (4GB?) and say "That's the RAM of about 70k new Desktops".