Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data (Paper Review)

jinuklee 2024. 8. 19. 18:23

https://arxiv.org/abs/2404.14367