Monte-Carlo graph search from first principles

397 points by bumbledraven, about 1 year ago

6 comments

fabmilo, about 1 year ago
I believe this kind of graph exploration is what we need to make progress on reasoning in AI. Plain LLMs will fail. The link has tons of good references, including Zobrist hashing (https://en.wikipedia.org/wiki/Zobrist_hashing) for game tables. We need to find a good hash for language-based state descriptions so that graph exploration doesn't explode computationally. Another good read on tree search is Thinking Fast and Slow: https://arxiv.org/abs/1705.08439, and also Teaching Large Language Models to Reason with Reinforcement Learning: https://arxiv.org/abs/2403.04642, which compares the MCTS approach to other current RL strategies.
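For readers unfamiliar with the Zobrist hashing mentioned above, here is a minimal sketch of the idea. The 3x3 board, the piece names, and the helper names `full_hash` and `toggle` are hypothetical choices for illustration, not taken from the linked article. The point is only that each (cell, piece) pair gets a random key, a position's hash is the XOR of the keys of its occupied cells, and so the same position gets the same hash regardless of the move order that produced it.

```python
import random

# Hypothetical toy setup: a 3x3 board with two piece kinds.
NUM_CELLS = 9
PIECES = ("X", "O")

rng = random.Random(0)  # fixed seed so the key table is reproducible
# One random 64-bit key per (cell, piece) pair.
ZOBRIST = {
    (cell, piece): rng.getrandbits(64)
    for cell in range(NUM_CELLS)
    for piece in PIECES
}

def full_hash(board):
    """Hash a position from scratch. `board` maps cell index -> piece."""
    h = 0
    for cell, piece in board.items():
        h ^= ZOBRIST[(cell, piece)]
    return h

def toggle(h, cell, piece):
    """Add or remove a piece incrementally; XOR is its own inverse."""
    return h ^ ZOBRIST[(cell, piece)]

# Two different move orders that reach the same position.
a = toggle(toggle(0, 0, "X"), 4, "O")
b = toggle(toggle(0, 4, "O"), 0, "X")
print(a == b)
```

This sketch leaves out the side to move and any game-specific state; a real engine would typically XOR in extra keys for those. A transposition table is then just a dictionary keyed by this hash.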
pixelpoet, about 1 year ago
I immediately recognised the author in the HN URL as the genius behind KataGo: https://github.com/lightvector/KataGo

His posts on https://www.reddit.com/r/cbaduk/ are consistently excellent.
dooglius, about 1 year ago
I'm not a strong chess player, but I know enough to be skeptical of the claim that the same position would be duplicated in a search tree often enough for it to matter. I'd be interested to see an actual measurement of this with Leela Zero. (And I'm not even counting the threefold repetition and fifty-move rules, which, when included in the state, make a repetition much less likely.)
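As a toy illustration of what "duplicated positions in a search tree" means, the sketch below counts move sequences versus distinct positions in tic-tac-toe. This is an assumed stand-in chosen because it is small enough to enumerate exhaustively; it says nothing about how often transpositions occur in chess or in Leela Zero's search, which is the measurement the comment asks for. The function name `count_transpositions` is hypothetical.

```python
from itertools import permutations

def count_transpositions(depth):
    """Return (move sequences, distinct positions) after `depth` plies.

    Players alternate X, O on a 3x3 board. Wins are ignored, which is
    exact for depth <= 4 because nobody can win before the fifth ply.
    """
    sequences = 0
    positions = set()
    for moves in permutations(range(9), depth):
        board = [None] * 9
        for ply, cell in enumerate(moves):
            board[cell] = "X" if ply % 2 == 0 else "O"
        positions.add(tuple(board))
        sequences += 1
    return sequences, len(positions)

print(count_transpositions(4))  # (3024, 756)
```

At four plies, 3024 move sequences collapse to 756 distinct positions, because each position can be reached by 2! x 2! = 4 orderings of the same moves. A tree search visits the sequences; a graph search visits the positions.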
modeless, about 1 year ago
> Also, as far as the name "Monte-Carlo Tree Search" itself, readers might note that there is nothing "Monte-Carlo" in the above algorithm - that it's completely deterministic!

MCTS as commonly implemented is deterministic? How strange! I assumed there was randomness in the sampling.
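To make the quoted point concrete, here is a minimal sketch of a PUCT-style child selection step of the kind used in AlphaZero-like searches, where a policy prior and a value estimate stand in for random rollouts. The dictionary fields `P`, `N`, `W`, the constant `c_puct=1.5`, and the choice to score unvisited children with a value of 0 are assumptions for illustration, not the article's exact formulation. Note that nothing in it draws a random number: given the same statistics, it picks the same child every time.

```python
import math

def puct_select(children, c_puct=1.5):
    """Return the index of the child maximizing Q + U.

    Each child is a dict with:
      P: prior probability from a policy (assumed given)
      N: visit count
      W: sum of value estimates backed up through this child
    """
    total_visits = sum(ch["N"] for ch in children)
    best_index, best_score = 0, -math.inf
    for i, ch in enumerate(children):
        # Mean value so far; 0.0 for unvisited children (an assumption).
        q = ch["W"] / ch["N"] if ch["N"] > 0 else 0.0
        # Exploration bonus: large for high-prior, rarely visited children.
        u = c_puct * ch["P"] * math.sqrt(total_visits) / (1 + ch["N"])
        score = q + u
        if score > best_score:  # strict '>' so ties go to the lowest index
            best_index, best_score = i, score
    return best_index

children = [
    {"P": 0.6, "N": 10, "W": 5.0},
    {"P": 0.3, "N": 2, "W": 1.5},
    {"P": 0.1, "N": 0, "W": 0.0},
]
print(puct_select(children))  # 1
```

The classic formulation of MCTS does use random playouts to evaluate leaves, which is where the name comes from; the determinism here comes from replacing those playouts with a learned evaluation and an argmax selection rule.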
rphln, about 1 year ago
Somehow the paper they mention completely flew under my radar when I was researching MCTS. It's surely going to be a lot of fun to give this modification a spin at my next opportunity.
behnamoh, about 1 year ago
A bit of an introduction to this would be nice.