
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

https://arxiv.org/pdf/2411.04282

Large language models (LLMs) have shown impressive capabilities, but still struggle with complex reasoning tasks requiring multiple steps. While prompt-based methods like Chain-of-Thought (CoT) can improve LLM reasoning at inference time, optimizing reasoning capabilities during training remains challenging. We int..

Uncategorized 2024.11.16

Large Language Models Can Self-Improve in Long-context Reasoning

https://arxiv.org/pdf/2411.08147

Large language models (LLMs) have achieved substantial progress in processing long contexts but still struggle with long-context reasoning. Existing approaches typically involve fine-tuning LLMs with synthetic data, which depends on annotations from human experts or advanced models like GPT-4, thus restricting further advancements. To address this issue, we invest..

Uncategorized 2024.11.16