
Forest-of-Thought paper summary

jinuklee 2025. 2. 10. 00:56

https://arxiv.org/html/2412.09078v1#S4

 

Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning

Zhenni Bi, Kai Han, Chuanjian Liu, Yehui Tang, Yunhe Wang

Benchmarks: GSM8K, MATH

 

3.1 FoT Framework

Suppose there are n reasoning trees.

The initial root of each tree represents the initial state, i.e., the input problem.
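A minimal sketch of this forest setup, assuming simple Node/Tree containers (all names here are illustrative, not from the paper's code):

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Node:
    state: str                       # partial reasoning trace so far
    score: float = 0.0               # evaluator score for this node
    children: list[Node] = field(default_factory=list)

@dataclass
class Tree:
    root: Node                       # root = the input problem itself
    activated: bool = True           # activation indicator (1 = active, 0 = terminated)

def build_forest(problem: str, n: int) -> list[Tree]:
    # Each of the n trees starts from the same root: the input problem.
    return [Tree(root=Node(state=problem)) for _ in range(n)]
```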

 

Sparse Activation.

Only the most relevant trees are selected (activated).

 

Within each tree, only the high-score nodes at each layer are selected and expanded.

 

If the nodes at a certain level of the tree cannot produce valid outputs, the tree’s splitting process will terminate early, and the activation indicator value will be set to 0.

 

In the valid case, the tree expands up to a certain depth before terminating, and the activation indicator value is set to 1 (see the sketch below).
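Continuing the sketch above (reusing Node and Tree), a hedged sketch of this layer-wise selection and early termination; expand_fn, score_fn, and is_valid are stand-ins for the LLM step generator and evaluator, which the summary does not specify at this level:

```python
def grow_tree(tree: Tree, expand_fn, score_fn, is_valid,
              max_depth: int = 5, top_k: int = 1) -> None:
    """Layer-by-layer expansion with sparse activation.

    expand_fn(node) -> list[str]   candidate next reasoning steps (an LLM call)
    score_fn(state) -> float       evaluator score for a candidate step
    is_valid(state) -> bool        whether the candidate is a usable output
    """
    frontier = [tree.root]
    for _ in range(max_depth):
        candidates = []
        for node in frontier:
            for state in expand_fn(node):
                if is_valid(state):
                    child = Node(state=state, score=score_fn(state))
                    node.children.append(child)
                    candidates.append(child)
        if not candidates:
            # No valid outputs at this level: terminate early, indicator = 0.
            tree.activated = False
            return
        # Keep only the highest-scoring nodes of this layer for expansion.
        frontier = sorted(candidates, key=lambda n: n.score, reverse=True)[:top_k]
    # Expanded to max depth with valid outputs: indicator stays 1.
    tree.activated = True
```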

 

To aid problem solving and comprehension, matching text is retrieved from the model's extensive knowledge base and then concatenated with the input.
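A minimal sketch of this input augmentation step; retrieve_fn is a hypothetical retrieval hook, since the summary only describes pulling relevant background text and concatenating it:

```python
def augment_input(problem: str, retrieve_fn, k: int = 3) -> str:
    """Prepend retrieved background text to the problem before reasoning.

    retrieve_fn(query, k) -> list of relevant text snippets; a stand-in for
    pulling supporting knowledge from the model's knowledge base.
    """
    snippets = retrieve_fn(problem, k)
    context = "\n".join(snippets)
    # Concatenate the retrieved context with the original problem.
    return f"{context}\n\nProblem: {problem}"
```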

3.2 Dynamic Self-Correction Strategy

The usual correction mechanism for low-scoring trajectories: low-scoring reasoning steps are detected and revised on the fly.
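A hedged sketch of threshold-triggered correction; CORRECTION_THRESHOLD, score_fn, and correct_fn are illustrative placeholders, not values or APIs from the paper:

```python
CORRECTION_THRESHOLD = 0.5  # illustrative value; the paper's threshold is not reproduced here

def self_correct(step: str, score_fn, correct_fn, max_retries: int = 2) -> str:
    """If a reasoning step scores below the threshold, ask the model to revise it.

    score_fn(step)   -> float        evaluator score for the step
    correct_fn(step) -> str          revised step (an LLM call prompting error review)
    """
    for _ in range(max_retries):
        if score_fn(step) >= CORRECTION_THRESHOLD:
            return step
        # Low score: have the model examine and fix the error, then re-score.
        step = correct_fn(step)
    return step
```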

3.3 Decision Making Strategy

Consensus-Guided Expert Decision (CGED) strategy 

A math expert verifies the accuracy of the answer chosen by voting.

 

If the results are inconsistent, an LLM expert examines them for errors and makes the final decision.
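A sketch of the CGED flow as summarized above: majority voting across the activated trees, with an LLM expert as the fallback on disagreement; expert_fn is a hypothetical stand-in for that expert call:

```python
from collections import Counter

def cged_decision(answers: list[str], expert_fn) -> str:
    """Consensus-Guided Expert Decision: majority vote, expert on disagreement.

    answers   : final answers produced by the activated trees
    expert_fn : LLM 'math expert' that examines the candidate answers for
                errors and returns the one it judges correct (an LLM call)
    """
    counts = Counter(answers)
    top, votes = counts.most_common(1)[0]
    if votes > len(answers) / 2:
        # Clear consensus among the trees: take the majority answer.
        return top
    # Inconsistent results: defer to the expert's error examination.
    return expert_fn(answers)
```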