https://arxiv.org/abs/2407.01476
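Best-first tree search as an inference-time algorithm
Target: agents for decision-making tasks such as web automation
These agents still need to improve at tasks involving multi-step reasoning, planning, and using feedback from the environment
The paper performs exploration and multi-step planning in web environments through an inference-time search algorithm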
Example: on benchmarks such as WebArena (Zhou et al., 2024b) and VisualWebArena (Koh et al., 2024), humans reach 78% and 89% success respectively, while agents score far lower
Reasons
One significant bottleneck in existing agents arises from their inability to leverage test-time computation for exploration and multi-step planning.
Search and planning is especially important in open-ended web environments, as the potential action space (i.e., all possible actions one can take on a webpage) is much larger than in most video games or text-based simulators.
One effective strategy for leveraging test-time compute to improve results is search: iteratively constructing, exploring, and pruning a graph of intermediate states and possible solutions.
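The "construct, explore, prune" loop can be made concrete with a minimal best-first search sketch. This is an illustration under assumptions, not the paper's implementation: the function name `best_first_search`, the parameters `budget` and `branching`, and the toy number-reaching task are all hypothetical. In a web agent, `expand` would propose candidate actions and execute them in the environment, and `value` would be a learned or model-based score rather than the hand-written heuristic used here.

```python
import heapq
from itertools import count
from typing import Callable, Hashable, Iterable, List


def best_first_search(
    start: Hashable,
    expand: Callable[[Hashable], Iterable[Hashable]],
    value: Callable[[Hashable], float],
    is_goal: Callable[[Hashable], bool],
    budget: int = 50,      # hypothetical cap on the number of node expansions
    branching: int = 3,    # hypothetical cap on children kept per node (pruning)
) -> List[Hashable]:
    """Return the path to a goal, or to the best-scoring state popped so far."""
    tie = count()  # tie-breaker so the heap never has to compare paths
    # heapq is a min-heap, so scores are stored negated.
    frontier = [(-value(start), next(tie), [start])]
    visited = {start}
    best_path, best_score = [start], value(start)
    expansions = 0

    while frontier and expansions < budget:
        neg_score, _, path = heapq.heappop(frontier)  # most promising state
        state, score = path[-1], -neg_score
        if score > best_score:
            best_score, best_path = score, path
        if is_goal(state):
            return path
        expansions += 1

        # Construct: generate unseen successors (dict.fromkeys removes duplicates
        # while keeping order).
        children = [c for c in dict.fromkeys(expand(state)) if c not in visited]
        # Prune: keep only the highest-valued successors.
        children.sort(key=value, reverse=True)
        for child in children[:branching]:
            visited.add(child)
            heapq.heappush(frontier, (-value(child), next(tie), path + [child]))

    return best_path


# Toy task (not from the paper): reach 10 from 1 using "+1" or "*2".
TARGET = 10
path = best_first_search(
    start=1,
    expand=lambda s: [s + 1, s * 2],
    value=lambda s: -abs(TARGET - s),  # closer to the target scores higher
    is_goal=lambda s: s == TARGET,
)
print(path)
```

Best-first search follows the highest-valued frontier state, so it can return a valid path that is not the shortest one. The `budget` limit is what ties the amount of search to available test-time compute.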