https://arxiv.org/pdf/2407.00320
Advancing Process Verification for Large Language Models via Tree-Based Preference Learning
https://arxiv.org/abs/2407.00390
Monte Carlo Tree Search
they often require more than 10 times the computational resources of greedy decoding due to wasteful search strategies, making them difficult to be deployed in practical applications. Results show that our methods offer c..