Brief summary: multimodal self-training

jinuklee 2025. 1. 24. 04:39

RLAIF-V assigns a trustworthiness score to the atomic claims of each candidate response using an open-source MLLM.
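
To make the claim-level idea concrete, here is a minimal sketch of how per-claim scores could be turned into preference pairs. It is not RLAIF-V's actual implementation: the claim scores are hard-coded instead of coming from an MLLM, and the mean aggregation, the `min_gap` filter, and the function names are all assumptions made for illustration.

```python
from itertools import combinations


def response_score(claim_scores):
    """Aggregate per-claim trust scores into one response score.

    Assumption: a simple mean; the real aggregation rule may differ.
    """
    if not claim_scores:
        return 0.0
    return sum(claim_scores) / len(claim_scores)


def build_preference_pairs(candidates, min_gap=0.0):
    """Build chosen/rejected pairs from scored candidate responses.

    candidates: dict mapping response text -> list of claim scores in [0, 1].
    min_gap: hypothetical filter; pairs whose score gap is not larger are skipped.
    """
    scored = {resp: response_score(scores) for resp, scores in candidates.items()}
    pairs = []
    for a, b in combinations(scored, 2):
        gap = scored[a] - scored[b]
        if abs(gap) <= min_gap:
            continue  # too close to call, no preference signal
        chosen, rejected = (a, b) if gap > 0 else (b, a)
        pairs.append({"chosen": chosen, "rejected": rejected})
    return pairs


# Hard-coded claim scores standing in for the MLLM's judgments.
candidates = {
    "A dog sits on a red sofa.": [0.9, 0.8],
    "A cat sits on a blue sofa.": [0.2, 0.4],
}
pairs = build_preference_pairs(candidates)
```

The resulting pairs are the kind of data a preference-optimization step such as DPO would consume.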

 

SIMA uses a critic prompt that considers several factors to build its preference-pair dataset.

 

VL-feedback uses GPT-4V to assess responses decoded from various LMMs on three aspects: helpfulness, visual faithfulness, and ethical considerations.
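
As a rough illustration of multi-aspect rating, the sketch below picks a chosen and a rejected response from per-aspect ratings. Everything here is assumed for the example: the ratings are hard-coded rather than returned by GPT-4V, the 1-5 scale, the unweighted average, the best-versus-worst pairing rule, and the model names are hypothetical.

```python
# Aspect names follow the three criteria listed above.
ASPECTS = ("helpfulness", "visual_faithfulness", "ethical_considerations")


def overall(rating):
    """Average the per-aspect ratings (assumption: equal weights)."""
    return sum(rating[aspect] for aspect in ASPECTS) / len(ASPECTS)


def best_vs_worst(ratings):
    """Return (chosen, rejected) model names by overall rating."""
    ranked = sorted(ratings, key=lambda name: overall(ratings[name]), reverse=True)
    return ranked[0], ranked[-1]


# Hypothetical annotator output for three responses to the same image and prompt.
ratings = {
    "model_a": {"helpfulness": 5, "visual_faithfulness": 4, "ethical_considerations": 5},
    "model_b": {"helpfulness": 3, "visual_faithfulness": 2, "ethical_considerations": 5},
    "model_c": {"helpfulness": 4, "visual_faithfulness": 4, "ethical_considerations": 5},
}
chosen, rejected = best_vs_worst(ratings)
```

Keeping the aspects separate until the last step makes it easy to swap the average for a weighted sum or to pair on a single aspect such as visual faithfulness.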

 

FGAIF trains a reward model to assign scores for various categories of hallucination.