RLAIF-V assigns a trustworthiness score to the atomic claims of each candidate response using an open-source MLLM.
SIMA uses a critic prompt that considers various factors to obtain its preference-pair dataset.
VL-feedback uses GPT-4V to assess responses decoded from other LMMs with respect to helpfulness, visual faithfulness, and ethical considerations.
FGAIF trains a reward model to assign scores to various categories of hallucination.
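
The methods above share one pattern: score candidate responses, then turn the scores into preference data. The sketch below illustrates only that pattern and is not the implementation of any of these methods. The `Candidate` class, the hard-coded claim scores, and the choice of averaging claim scores into a response score are all assumptions made for illustration; in practice the scores would come from an MLLM judge or a reward model.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Candidate:
    """A candidate response with hypothetical per-claim trust scores in [0, 1]."""
    text: str
    claim_scores: list[float]


def response_score(candidate: Candidate) -> float:
    # Assumption: the response score is the mean of its claim scores.
    # Real methods may aggregate differently (e.g. counting rejected claims).
    return mean(candidate.claim_scores)


def build_preference_pair(candidates: list[Candidate]) -> tuple[str, str]:
    """Return (chosen, rejected): the highest- and lowest-scoring responses."""
    if len(candidates) < 2:
        raise ValueError("need at least two candidates to form a pair")
    ranked = sorted(candidates, key=response_score, reverse=True)
    return ranked[0].text, ranked[-1].text


# Placeholder scores standing in for judgments from a scoring model.
candidates = [
    Candidate("A dog sits on a red sofa.", [1.0, 1.0, 0.5]),
    Candidate("A dog sits on a blue chair.", [1.0, 0.0]),
    Candidate("A cat sleeps on the floor.", [0.0, 0.5]),
]
chosen, rejected = build_preference_pair(candidates)
```

Taking only the best and worst response yields one pair per prompt; pairing every higher-scored response with every lower-scored one would produce more data at the cost of noisier comparisons.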