https://arxiv.org/abs/2410.21728
Abstract
Teaching large language models (LLMs) to generate text with citations to evidence sources can mitigate hallucinations and enhance verifiability in information-seeking systems.
However, improving this capability requires high-quality attribution data, which is costly and labor-intensive.
Inspired by recent advances in self-improvement that enhance LLMs without manual annotation, we present START, a Self-Taught AttRibuTion framework for iteratively improving the attribution capability of LLMs.
First, to prevent models from stagnating due to initially insufficient supervision signals, START leverages the model to self-construct synthetic training data for warming up.
To further improve the model's attribution ability, START iteratively utilizes fine-grained preference supervision signals constructed from its sampled responses to encourage robust, comprehensive, and attributable generation.
Experiments on three open-domain question-answering datasets, covering long-form QA and multi-step reasoning, demonstrate significant performance gains of 25.13% on average, without relying on human annotations or more advanced models.
Further analysis reveals that START excels in aggregating information across multiple sources.
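To make the two-stage procedure described above concrete, here is a minimal, hypothetical Python sketch of its control flow: a warm-up phase on self-constructed synthetic data, followed by iterative preference optimization over the model's own sampled responses. Every name in it (Model, fine_tune, build_preference_pairs, preference_optimize, and so on) is an illustrative placeholder, not the authors' actual implementation.

```python
"""Minimal sketch of the two-stage START loop from the abstract.
All helpers below are placeholders standing in for the paper's
actual components (data synthesis, fine-tuning, preference optimization)."""

from dataclasses import dataclass


@dataclass
class Model:
    name: str
    version: int = 0


def self_construct_synthetic_data(model, queries):
    # Placeholder: the model generates its own (query, attributed answer)
    # pairs to warm up before supervision signals are strong enough.
    return [(q, f"answer to {q} [1]") for q in queries]


def fine_tune(model, data):
    # Placeholder for supervised fine-tuning on the synthetic data.
    return Model(model.name, model.version + 1)


def sample_responses(model, query, k):
    # Placeholder: sample k candidate responses for one query.
    return [f"{query} -> candidate {i}" for i in range(k)]


def build_preference_pairs(candidates):
    # Placeholder: rank each query's sampled responses by fine-grained
    # attribution quality and pair a better response with a worse one.
    return [(c[0], c[-1]) for c in candidates if len(c) > 1]


def preference_optimize(model, pairs):
    # Placeholder for preference-based optimization on the pairs.
    return Model(model.name, model.version + 1)


def start_loop(model, queries, iterations=3, k=8):
    # Stage 1: warm-up on self-constructed synthetic attribution data.
    model = fine_tune(model, self_construct_synthetic_data(model, queries))
    # Stage 2: iterative self-improvement from the model's own samples.
    for _ in range(iterations):
        candidates = [sample_responses(model, q, k) for q in queries]
        model = preference_optimize(model, build_preference_pairs(candidates))
    return model


if __name__ == "__main__":
    print(start_loop(Model("llm"), ["who wrote Hamlet?"]))
```

The structural point the sketch is meant to surface is that all stage-2 supervision comes from the model's own sampled responses, so no human annotation enters the loop.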
1 Introduction
The rapid development of large language models (LLMs) (OpenAI, 2023; Zhao et al., 2023) has established them as indispensable tools for information seeking.
Despite their remarkable ability to generate fluent and informative responses to user queries, LLMs remain prone to hallucinations (Huang et al., 2023).