
Building an AlphaZero AI using Python and Keras

406 points by datashrimp over 7 years ago

16 comments

glinscott over 7 years ago
There is a public distributed effort happening for Go right now: http://zero.sjeng.org/. They've been doing a fantastic job, and they just recently fixed a big training bug that has resulted in a large strength increase.

I ported GCP's Go implementation over to chess: https://github.com/glinscott/leela-chess. The distributed part isn't ready to go yet; we are still working out the bugs using supervised training, but we will be launching soon!
superbatfish over 7 years ago
In 1989, Victor Allis "solved" the game of Connect 4, proving (apparently) that the first player can always force a win, even when both sides play perfectly.

In 1996, Giuliano Bertoletti implemented Victor Allis's strategy in a program named Velena:

http://www.ce.unipr.it/~gbe/velena.html

It's written in C. If someone can get it to compile on a modern system, it would be interesting to see how well the AlphaZero approach fares against a supposedly perfect AI.
chrisfosterelli over 7 years ago
Can someone share some intuition about the tradeoffs between Monte Carlo tree search and vanilla policy gradient reinforcement learning?

MCTS has gotten really popular since AlphaZero, but it's not clear to me how it compares to simpler reinforcement learning techniques that just have a softmax output over the agent's possible moves. My intuition is that MCTS is better for planning but takes longer to train and evaluate. Is that true? Are there games where one will work better than the other?
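For intuition, here is a minimal, hypothetical Python sketch of the two decision rules side by side. The policy-gradient agent samples one move from the network's softmax; the AlphaZero-style agent runs many simulations guided by those same probabilities and plays the most-visited move. The priors, the random value estimate standing in for a value-network call, and all constants below are illustrative assumptions, not code from the article (and real AlphaZero searches a full tree, while this expands a single node):

    import math
    import random

    import numpy as np

    def softmax(x):
        z = np.exp(x - np.max(x))
        return z / z.sum()

    # Vanilla policy gradient: the network's softmax IS the policy.
    # One forward pass per move; all "planning" lives in the learned weights.
    def policy_gradient_move(logits):
        probs = softmax(logits)
        return np.random.choice(len(probs), p=probs)

    # AlphaZero-style MCTS (single node only, for illustration): the same
    # network merely *guides* an explicit search, so the move played
    # reflects many lookahead evaluations instead of one forward pass.
    def mcts_move(priors, n_simulations=200, c_puct=1.0):
        n = len(priors)
        visits = np.zeros(n)       # N(a): how often each move was explored
        value_sum = np.zeros(n)    # W(a): total value backed up per move
        for _ in range(n_simulations):
            q = np.divide(value_sum, visits, out=np.zeros(n), where=visits > 0)
            u = c_puct * priors * math.sqrt(visits.sum() + 1) / (1 + visits)
            a = int(np.argmax(q + u))        # PUCT selection rule
            v = random.uniform(-1, 1)        # stand-in for a value-net call
            visits[a] += 1
            value_sum[a] += v
        return int(np.argmax(visits))        # play the most-visited move

    logits = np.array([0.1, 1.5, -0.4])
    print("policy gradient picks:", policy_gradient_move(logits))
    print("MCTS picks:", mcts_move(softmax(logits)))

This also makes the cost tradeoff concrete: the policy-gradient agent does one network evaluation per move, while the MCTS agent does one per simulation.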
bluetwo over 7 years ago
FYI: The AlphaGo documentary is now on Netflix.
frenchie4111 over 7 years ago
Shameless self-plug: I spent a Saturday morning recently doing something similar (no Monte Carlo, no AI library) with tic-tac-toe. I based it mostly on intuition and would love any feedback.

https://github.com/frenchie4111/genetic-algorithm-playground/blob/master/tictactoe.ipynb
Avery3R over 7 years ago
Use this to get rid of the obnoxiously large sticky header: https://alisdair.mcdiarmid.org/kill-sticky-headers/
Will_Parker over 7 years ago
> Not quite as complex as Go, but there are still 4,531,985,219,092 game positions in total, so not trivial for a laptop to learn how to play well with zero human input.

That's a small enough state space that it is indeed trivial to brute-force on a laptop.

Putting that aside, though, it would be interesting to compare against a standard alpha-beta-pruned minimax algorithm running at various depth levels.
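For reference, a fixed-depth alpha-beta searcher is only a few lines. Below is a generic negamax sketch with alpha-beta cutoffs; the Connect 4 move generator and evaluator are not implemented here — a trivial one-pile Nim game is stubbed in purely so the code runs, so treat it as an illustration rather than a Connect 4 engine:

    # Generic fixed-depth negamax with alpha-beta cutoffs. Plug in your own
    # moves/apply_move/evaluate for Connect 4; the Nim stubs below exist
    # only so this sketch runs as-is.

    def negamax(state, depth, alpha, beta, moves, apply_move, evaluate):
        legal = moves(state)
        if depth == 0 or not legal:
            return evaluate(state), None
        best_move = None
        for m in legal:
            child = apply_move(state, m)
            score, _ = negamax(child, depth - 1, -beta, -alpha,
                               moves, apply_move, evaluate)
            score = -score                  # opponent's gain is our loss
            if score > alpha:
                alpha, best_move = score, m
            if alpha >= beta:
                break                       # beta cutoff: line already refuted
        return alpha, best_move

    # Toy stand-in game: one-pile Nim, take 1-3 stones, taking the last wins.
    def nim_moves(n):
        return [t for t in (1, 2, 3) if t <= n]

    def nim_apply(n, t):
        return n - t

    def nim_eval(n):
        return -1 if n == 0 else 0  # side to move facing an empty pile lost

    score, move = negamax(10, 12, -2, 2, nim_moves, nim_apply, nim_eval)
    print(score, move)  # 1 2 -- take two stones, leaving a losing multiple of 4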
smortaz over 7 years ago
Thanks for the great demo! Uploaded to Azure Notebooks in case anyone wants to run/play/edit...

https://notebooks.azure.com/smortaz/libraries/Demo-DeepReinforcementLearning

Click Clone to get your own copy, then run the run.ipynb file.
thrw3249845 over 7 years ago
As an aside, does anybody know the monospace font that we see in the screenshots? Here, for instance: https://cdn-images-1.medium.com/max/1200/1*8zfDGlLuXfiLGnWlzvZwmQ.png
datascientist over 7 years ago
RISE Lab's Ray platform (which now includes RLlib) is another option: https://www.oreilly.com/ideas/introducing-rllib-a-composable-and-scalable-reinforcement-learning-library
wyattk over 7 years ago
Does anyone have a different link to this? It's insecure and Cisco keeps blocking it, so I can't just proceed from Chrome.
m3kw9 over 7 years ago
I'd like the title more if it were "Roll your own AlphaZero using Keras and Python".
poopchute over 7 years ago
Is there a magic incantation I have to say to get this to run? Jupyter says I'm missing things when I try to run it (despite installing them with pip).
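One common culprit (an assumption, since the comment doesn't say which packages are missing): pip installed into a different Python environment than the one backing the Jupyter kernel. Installing through the kernel's own interpreter from a notebook cell usually sorts this out; the package names below are guesses based on the article's stack:

    # Run in a notebook cell: install into the *kernel's* Python, not
    # whatever `pip` happens to be first on your PATH. Package names are
    # guesses -- substitute whatever the import errors actually name.
    import sys
    !{sys.executable} -m pip install keras tensorflow numpy matplotlib

You can confirm the mismatch by comparing sys.executable inside the notebook with the output of `which pip` in your terminal.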
make3 over 7 years ago
An AI trained in a perfect simulation is not really AI. That is exactly the one scenario where AI is easy.
_pdp_ over 7 years ago
Very nice find! I loved it!
X6S1x6Okd1st over 7 years ago
If you actually want to contribute to an open-source AlphaZero implementation, you may want to check out https://github.com/gcp/leela-zero