Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models: Paper Review

jinuklee 2024. 8. 16. 23:24

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is pivotal for advancing Large Language Models (LLMs). In this paper, we delve into the prospect of growing a strong LLM out of a weak one without the need for acquiring addi

arxiv.org

preliminary