MCTS + o1 journey
https://arxiv.org/pdf/2412.15797 Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning
https://arxiv.org/pdf/2501.01478 Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search
https://arxiv.org/abs/2501.07301 The Lessons of Developing Process Reward Models in Mathematical Reasoning
https://github.com/GAIR-NLP/O1-Journey o1 journey