There is a public distributed effort happening for Go right now: http://zero.sjeng.org/. They've been doing a fantastic job, and just recently fixed a big training bug that has resulted in a large strength increase.

I ported GCP's Go implementation over to chess: https://github.com/glinscott/leela-chess. The distributed part isn't ready to go yet; we are still working out the bugs using supervised training, but we will be launching soon!
In 1989, Victor Allis "solved" the game of Connect 4, proving (apparently) that the first player can always force a win with perfect play.

In 1996, Giuliano Bertoletti implemented Victor Allis's strategy in a program named Velena:

http://www.ce.unipr.it/~gbe/velena.html

It's written in C. If someone can get it to compile on a modern system, it would be interesting to see how well the AlphaZero approach fares against a supposedly perfect AI.
Can someone share some intuition about the tradeoffs between Monte Carlo tree search and vanilla policy-gradient reinforcement learning?

MCTS has gotten really popular since AlphaZero, but it's not clear to me how it compares to simpler reinforcement learning techniques that just have a softmax output over the possible moves the agent can make. My intuition is that MCTS is better for planning, but takes longer to train/evaluate. Is that true? Are there games where one will work better than the other?
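For concreteness, here is a rough numpy sketch (my own, not from the article) of the two decision rules side by side: a plain softmax policy just samples a move from its output, while AlphaZero-style MCTS uses those same softmax outputs as priors in the PUCT selection rule and, after many simulations, plays the most-visited move. The logits, simulation count, and the random placeholder leaf values are all made up for illustration.

    import numpy as np

    # Hypothetical single state with 3 legal moves; the policy net outputs logits.
    logits = np.array([1.0, 0.5, -0.2])
    priors = np.exp(logits) / np.exp(logits).sum()   # softmax over moves

    # Vanilla policy-gradient play: sample a move straight from the softmax.
    pg_move = np.random.choice(len(priors), p=priors)

    # AlphaZero-style play: run simulations, picking children with the PUCT rule,
    # which blends the prior with value estimates accumulated during search.
    N = np.zeros(3)          # visit count per move
    W = np.zeros(3)          # total value per move
    c_puct = 1.5
    for _ in range(100):
        Q = np.where(N > 0, W / np.maximum(N, 1), 0.0)
        # "+ 1" inside the sqrt just keeps the first pick from being all zeros.
        U = c_puct * priors * np.sqrt(N.sum() + 1) / (1 + N)
        a = int(np.argmax(Q + U))
        # In a real implementation this value comes from recursing down the tree
        # and evaluating the leaf with the value head; here it's a stand-in.
        value = np.random.uniform(-1, 1)
        N[a] += 1
        W[a] += value

    mcts_move = int(np.argmax(N))  # final move: most-visited child
    print(pg_move, mcts_move)

The tradeoff is roughly what the sketch suggests: the search spends extra compute on every move, but the resulting visit counts tend to be a much stronger training target than the raw sampled move used in plain policy gradient.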
Shameless self-plug: I spent a Saturday morning recently doing a similar thing (no Monte Carlo, no AI library) with tic-tac-toe. I based it mostly on intuition; would love any feedback.

https://github.com/frenchie4111/genetic-algorithm-playground/blob/master/tictactoe.ipynb
Use this to get rid of the obnoxiously large sticky header: https://alisdair.mcdiarmid.org/kill-sticky-headers/
> Not quite as complex as Go, but there are still 4,531,985,219,092 game positions in total, so not trivial for a laptop to learn how to play well with zero human input.<p>That's a small enough state space that it is indeed trivial to brute force it on a laptop.<p>Putting aside that though, it would be interesting to compare vs a standard alpha-beta pruning minimax algorithm running at various depth levels.
Thanks for the great demo! Uploaded to Azure Notebooks in case anyone wants to run/play/edit:

https://notebooks.azure.com/smortaz/libraries/Demo-DeepReinforcementLearning

Click Clone to get your own copy, then run the run.ipynb file.
As an aside, does anybody know the monospace font we see in the screenshots? Here, for instance: https://cdn-images-1.medium.com/max/1200/1*8zfDGlLuXfiLGnWlzvZwmQ.png
RISE Lab's Ray platform (which now includes RLlib) is another option: https://www.oreilly.com/ideas/introducing-rllib-a-composable-and-scalable-reinforcement-learning-library
Is there a magic incantation I have to say to get this to run? Jupyter says I'm missing things when I try to run it (despite installing those things with pip).
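(For anyone hitting the same thing: one common culprit, assuming this is what's going on here, is that the Jupyter kernel runs a different Python environment than the pip you installed into. A quick check, with example package names only:

    # Check which Python the notebook kernel is actually running, then install
    # the missing packages into that same environment.
    import subprocess
    import sys

    print(sys.executable)
    subprocess.check_call([sys.executable, "-m", "pip", "install", "numpy", "matplotlib", "keras"])

If the printed path isn't the Python you ran pip with, that's the mismatch.)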
If you actually want to contribute towards an open-source AlphaZero implementation, you may want to check out https://github.com/gcp/leela-zero