
LMM-as-a-judge / PROMETHEUS-VISION: Vision-Language Model as a Judge for Fine-Grained Evaluation (Paper Review)

https://arxiv.org/pdf/2401.06591
Assessing long-form responses generated by Vision-Language Models (VLMs) is challenging. It not only requires checking whether the VLM follows the given instruction but also verifying whether the text output is properly grounded on the given image. Inspired by the recent approach of evaluating LMs with LMs, in this work, we propose to evaluate VLMs with VLMs. For ..
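The core idea, grading one VLM's long-form output with another VLM acting as the judge against a fine-grained rubric, can be sketched as below. The prompt template, the judge model's generate interface, and the score parsing are illustrative assumptions, not the exact PROMETHEUS-VISION setup.

```python
# A minimal sketch of the VLM-as-a-judge idea: the judge prompt, the model
# interface, and the score parsing are assumptions for illustration only.
import re

JUDGE_TEMPLATE = """You are evaluating a response from a vision-language model.
[Instruction]: {instruction}
[Response to evaluate]: {response}
[Score rubric (1-5)]: {rubric}
Write brief feedback grounded in the image, then end with "Score: <1-5>"."""

def judge_response(judge_vlm, image, instruction, response, rubric):
    """Ask a judge VLM for feedback and a 1-5 score on another VLM's output."""
    prompt = JUDGE_TEMPLATE.format(instruction=instruction,
                                   response=response, rubric=rubric)
    feedback = judge_vlm.generate(image=image, prompt=prompt)  # assumed interface
    match = re.search(r"Score:\s*([1-5])", feedback)
    score = int(match.group(1)) if match else None
    return feedback, score
```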

Calibrated Self-Rewarding Vision Language Models (Paper Review)

https://arxiv.org/pdf/2405.14622
Summary of the reward assignment: a self-generated instruction-following score (calculated using the language decoder of the LVLM; this score alone is insufficient because of modality misalignment, which can overlook visual input information) plus the image-response relevance score R^I(s) (CLIP-score [17] is leveraged for this calculation). Section 3, Calibrated Self-Rewarding Vision Language Models: To address t..
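A minimal sketch of how the two signals could be combined, assuming a Hugging Face CLIP model for the image-response relevance term R^I(s). The weighted-sum calibration and the interfaces below are assumptions for illustration, not the paper's exact formulation.

```python
# A minimal sketch of calibrating a self-generated instruction-following score
# with a CLIP-based image-response relevance score R^I(s). The weighting is an
# illustrative assumption; the paper's exact calibration may differ.
import torch
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
clip_proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def image_response_relevance(image, response: str) -> float:
    """CLIP-score style relevance between the image and the text response."""
    inputs = clip_proc(text=[response], images=image,
                       return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = clip(**inputs)
    # cosine similarity between the image and text embeddings
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return float((img * txt).sum())

def calibrated_reward(self_score: float, image, response: str,
                      alpha: float = 0.5) -> float:
    """Blend the LVLM's own instruction-following score with R^I(s)."""
    return (1 - alpha) * self_score + alpha * image_response_relevance(image, response)
```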

Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward

https://arxiv.org/pdf/2404.01258v2
Preference modeling techniques, such as direct preference optimization (DPO), have proven effective in enhancing the generalization abilities of large language models (LLMs). However, in tasks involving video instruction following, providing informative feedback, especially for detecting hallucinations in generated responses, remains a significant challenge. Previous..
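For reference, a minimal sketch of the standard DPO objective that such preference pairs would be trained with. The log-probability inputs are placeholders; how the (chosen, rejected) pairs are labeled by a language-model reward over video content follows the paper only at a high level.

```python
# A minimal sketch of the DPO objective on (chosen, rejected) response pairs.
# The preference labels are assumed to come from a language-model reward;
# the log-probability tensors are placeholders.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta: float = 0.1):
    """Standard DPO loss: push the policy to prefer chosen over rejected."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    return -F.logsigmoid(logits).mean()
```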

Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling (Paper Review)

https://arxiv.org/pdf/2408.16737
Training on high-quality synthetic data from strong language models (LMs) is a common strategy to improve the reasoning performance of LMs. In this work, we revisit whether this strategy is compute-optimal under a fixed inference budget (e.g., FLOPs). To do so, we investigate the trade-offs between generating synthetic data using a stronger but more expensive (SE) ..
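A back-of-the-envelope sketch of the fixed-budget trade-off: approximating inference cost as roughly 2N FLOPs per generated token for an N-parameter model, the weaker-but-cheaper model yields proportionally more samples under the same budget. The model sizes, token count, and budget below are illustrative assumptions, not the paper's exact setup.

```python
# Back-of-the-envelope comparison of sample counts under a fixed FLOPs budget.
# Numbers are illustrative assumptions (e.g. a 27B SE model vs a 9B WC model).
def samples_under_budget(budget_flops: float, n_params: float,
                         tokens_per_sample: int) -> float:
    """Approximate number of samples generable under a fixed FLOPs budget."""
    flops_per_sample = 2 * n_params * tokens_per_sample  # ~2N FLOPs per token
    return budget_flops / flops_per_sample

budget = 1e18            # assumed fixed inference budget (FLOPs)
n_se, n_wc = 27e9, 9e9   # stronger-expensive vs weaker-cheap model sizes
tokens = 512             # assumed tokens per synthetic solution

print(samples_under_budget(budget, n_se, tokens))  # fewer SE samples
print(samples_under_budget(budget, n_wc, tokens))  # ~3x more WC samples
```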
