科技回声

6 comments

fabmilo, about 1 year ago

I believe this kind of graph exploration is what we need to make progress on reasoning in AI. Plain LLMs will fail. The link has tons of good references, including Zobrist hashing (https://en.wikipedia.org/wiki/Zobrist_hashing) for game tables. We need to find a good hash for language-based state descriptions so that graph exploration doesn't explode computationally. Another good read on tree search is Thinking Fast and Slow with Deep Learning and Tree Search (https://arxiv.org/abs/1705.08439), along with Teaching Large Language Models to Reason with Reinforcement Learning (https://arxiv.org/abs/2403.04642), comparing the MCTS approach to other current RL strategies.
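For readers who have not met Zobrist hashing, here is a minimal sketch of the idea on a toy board. The function names (`make_zobrist_table`, `full_hash`, `toggle`) and the board encoding are assumptions made for illustration, not taken from the linked article or from any engine. Each (square, piece) pair gets a fixed random 64-bit key, a position's hash is the XOR of the keys of its occupied squares, and a move updates the hash with a couple of XORs instead of rehashing the whole board.

```python
import random


def make_zobrist_table(num_squares, num_pieces, seed=0):
    """One random 64-bit key per (square, piece) pair. Seeded so runs are repeatable."""
    rng = random.Random(seed)
    return [[rng.getrandbits(64) for _ in range(num_pieces)]
            for _ in range(num_squares)]


def full_hash(board, table):
    """Hash a whole position: XOR the keys of all occupied squares.

    `board` is a list where each entry is a piece index or None (hypothetical encoding).
    """
    h = 0
    for square, piece in enumerate(board):
        if piece is not None:
            h ^= table[square][piece]
    return h


def toggle(h, table, square, piece):
    """Add or remove a piece incrementally. XOR is its own inverse."""
    return h ^ table[square][piece]
```

Because XOR is commutative, two move orders that reach the same position produce the same hash, which is what lets a search table recognise transpositions. How to build a comparable hash for language-based states is the open question raised in the comment; this sketch does not address it.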


pixelpoet, about 1 year ago

I immediately recognised the author in the HN URL as the genius behind KataGo: https://github.com/lightvector/KataGo

His posts on https://www.reddit.com/r/cbaduk/ are consistently excellent.


dooglius, about 1 year ago

My chess experience is not that high, but enough that I'm skeptical of the claim that the same position would be duplicated in a search tree enough for it to matter. I'd be interested to see an actual measure of this with Leela Zero. (And I'm not even considering the threefold repetition and fifty-move rules, which when considered in the state make a repetition much less likely).
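One way to put a number on this kind of question is to compare how many move sequences a search tree contains with how many distinct positions they reach. The sketch below does that for tic-tac-toe, purely as a stand-in: it says nothing about chess or Leela Zero, and the function name and board encoding are assumptions for illustration. Win detection is skipped, which is only valid for depths up to 4, since no game can end before the fifth ply.

```python
def count_paths_and_positions(depth):
    """Return (move sequences, distinct positions) at exactly `depth` plies.

    Toy tic-tac-toe enumeration. Boards are 9-tuples with 0 = empty,
    1 = first player, 2 = second player. No win check, so use depth <= 4.
    """
    paths = 0
    seen = set()

    def walk(board, ply):
        nonlocal paths
        if ply == depth:
            paths += 1
            seen.add(board)
            return
        player = 1 if ply % 2 == 0 else 2
        for square in range(9):
            if board[square] == 0:
                walk(board[:square] + (player,) + board[square + 1:], ply + 1)

    walk((0,) * 9, 0)
    return paths, len(seen)
```

At depth 4 this gives 3024 move sequences but only 756 distinct positions, so each position is reached four times on average. Whether the ratio is anywhere near as large in a selective chess search is exactly what would need measuring.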


modeless, about 1 year ago

> Also, as far as the name "Monte-Carlo Tree Search" itself, readers might note that there is nothing "Monte-Carlo" in the above algorithm - that it's completely deterministic!

MCTS as commonly implemented is deterministic? How strange! I assumed there was randomness in the sampling.
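To make the quoted point concrete, here is a minimal sketch of a PUCT-style selection step of the kind used in AlphaZero-like searches. The exact formula in the linked article may differ, and the function name, argument layout, and default `c_puct` are assumptions for illustration. The step is a plain argmax over a score built from value estimates, priors, and visit counts, with no random sampling anywhere.

```python
import math


def puct_select(priors, visit_counts, q_values, c_puct=1.5):
    """Pick the child index maximising Q + c_puct * P * sqrt(sum N) / (1 + N).

    Deterministic: the same statistics always give the same choice.
    Ties go to the lowest index. All names here are hypothetical.
    """
    total_visits = sum(visit_counts)
    best_index = 0
    best_score = -math.inf
    for i, (p, n, q) in enumerate(zip(priors, visit_counts, q_values)):
        exploration = c_puct * p * math.sqrt(total_visits) / (1 + n)
        score = q + exploration
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

With priors `[0.5, 0.3, 0.2]`, visit counts `[10, 0, 0]`, and value estimates `[0.1, 0.0, 0.0]`, the call returns index 1: the heavily visited first child has its exploration term divided by 11, so the unvisited child with the next-highest prior wins. Classic MCTS with random rollouts does sample, but a search that evaluates leaves with a network and selects this way has no random element.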


rphln, about 1 year ago

Somehow the paper they mention completely flew under my radar when I was researching MCTS. Surely it's gonna be a lot of fun to give this modification a spin on my next opportunity.

behnamoh, about 1 year ago

A bit of an introduction to this would be nice.


Monte-Carlo graph search from first principles