ARGS: Alignment as Reward-Guided Search (Paper Review)
https://arxiv.org/abs/2402.01694

Abstract (excerpt): "Aligning large language models with human objectives is paramount, yet common approaches including RLHF suffer from unstable and resource-intensive training. In response to this challenge, we introduce ARGS, Alignment as Reward-Guided Search, a novel frame..."
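Before going through the paper, here is a minimal sketch of what reward-guided decoding of this flavor could look like: at each step the top-k next-token candidates are rescored with a reward model and the highest-scoring token is kept. This is only an assumption based on the paper's title, not the authors' implementation; `lm`, `reward_model`, `tokenizer`, and the weight `w` are hypothetical placeholders.

```python
import torch

def reward_guided_decode(lm, reward_model, tokenizer, prompt,
                         max_new_tokens=64, k=10, w=1.0):
    """Sketch of greedy reward-guided decoding (hypothetical interfaces).

    At each step, rescore the top-k candidate tokens with
    score = log p(token | context) + w * reward(context + token),
    then append the best-scoring token.
    """
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        logits = lm(ids).logits[0, -1]              # next-token logits
        logprobs = torch.log_softmax(logits, dim=-1)
        top_lp, top_ids = logprobs.topk(k)          # k candidate continuations
        scores = []
        for lp, tok in zip(top_lp, top_ids):
            cand = torch.cat([ids, tok.view(1, 1)], dim=-1)
            r = reward_model(cand)                  # scalar reward for the partial sequence
            scores.append(lp + w * r)               # likelihood + weighted reward
        best = top_ids[int(torch.stack(scores).argmax())]
        ids = torch.cat([ids, best.view(1, 1)], dim=-1)
        if best.item() == tokenizer.eos_token_id:
            break
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```

The point of this framing is that the reward model steers generation at inference time, so no RL fine-tuning of the base model is needed; the weight `w` trades off fluency against reward.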