39 points by roboboffin 4 months ago

5 comments

s-macke 4 months ago

> Notably, no self-reflection training data or prompt was included, suggesting that advanced System 2 reasoning can foster intrinsic self-reflection.

They suggest that self-reflection is an emergent phenomenon of reasoning. Impressive. Can't wait to see the code.

throwaway81523 4 months ago

Abstract is impressive. I'm surprised this post hasn't gotten more attention.

helltone 4 months ago

Off topic, but how is MCTS usually implemented efficiently? It has a branching structure that doesn't seem parallelizable (on a GPU).

fabmilo 4 months ago

I was just about to submit this link and was redirected to this page. I am shocked that it received only four comments. If you are working in the LLM/agent space (you are, right?) and you don't understand the significance of this paper, you are set for failure.

dantodor 4 months ago

The repo gives 404?

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
