Brief summary: multimodal self-training
jinuklee
2025. 1. 24. 04:39
RLAIF-V uses an open-source MLLM to assign a trustworthiness score to the atomic claims of each candidate response.
SIMA uses a critic prompt that considers various factors to obtain a preference-pair dataset.
VL-feedback uses GPT-4V to assess responses decoded from other LMMs with respect to helpfulness, visual faithfulness, and ethical considerations.
FGAIF trains a reward model that assigns scores for various categories of hallucination.
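The common pattern across these methods (score candidate responses, then turn the scores into preference pairs) can be sketched roughly as follows. This is only an illustration of the atomic-claim idea mentioned for RLAIF-V, not the implementation from any of the papers: `Candidate`, `response_score`, `build_preference_pairs`, the mean aggregation, and the `min_gap` threshold are all hypothetical, and the claim scorer is a toy lookup table standing in for an MLLM call.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    text: str          # full candidate response
    claims: List[str]  # atomic claims, assumed to be extracted beforehand


def response_score(candidate: Candidate,
                   claim_scorer: Callable[[str], float]) -> float:
    """Aggregate per-claim trust scores into one response score.

    Assumption: a plain mean is used here; the papers may aggregate differently.
    """
    if not candidate.claims:
        return 0.0
    return sum(claim_scorer(c) for c in candidate.claims) / len(candidate.claims)


def build_preference_pairs(candidates: List[Candidate],
                           claim_scorer: Callable[[str], float],
                           min_gap: float = 0.0) -> List[Tuple[str, str]]:
    """Return (chosen, rejected) text pairs for candidates whose scores differ
    by more than min_gap (a hypothetical threshold)."""
    scored = [(response_score(c, claim_scorer), c) for c in candidates]
    pairs = []
    for (score_a, a), (score_b, b) in combinations(scored, 2):
        if abs(score_a - score_b) <= min_gap:
            continue  # skip pairs that are too close to call
        chosen, rejected = (a, b) if score_a > score_b else (b, a)
        pairs.append((chosen.text, rejected.text))
    return pairs


# Toy stand-in for the MLLM: in practice the model would be asked whether
# each claim is supported by the image.
toy_scores = {
    "a dog is on the grass": 0.9,
    "there is a frisbee": 0.8,
    "the dog is red": 0.1,
}


def toy_scorer(claim: str) -> float:
    return toy_scores.get(claim, 0.5)


candidates = [
    Candidate("A dog on grass with a frisbee.",
              ["a dog is on the grass", "there is a frisbee"]),
    Candidate("A red dog on grass.",
              ["a dog is on the grass", "the dog is red"]),
]

pairs = build_preference_pairs(candidates, toy_scorer, min_gap=0.1)
print(pairs)
```

The resulting (chosen, rejected) pairs are the kind of data a preference-optimization step such as DPO would consume.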