MMHAL-BENCH : ALIGNING LARGE MULTIMODAL MODELS WITH FACTUALLY AUGMENTED RLHF Paper Review
https://arxiv.org/pdf/2309.14525

From the abstract: Large Multimodal Models (LMMs) are built across modalities, and the misalignment between the two modalities can result in "hallucination", generating textual outputs that are not grounded by the multimodal information in context. To address the multimodal misalignment issue, we adapt Reinforcement Learning from Human Feedback (RLHF) from the text domain to the task of vision-language alignment…
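To make the RLHF adaptation concrete, below is a minimal sketch of the pairwise reward-model objective that underlies this kind of preference training: the reward of the less-hallucinated response is pushed above that of the more-hallucinated one. The `PreferenceRewardModel`, `feature_dim`, and random stand-in features are hypothetical placeholders, not the paper's actual architecture; the paper builds its reward model on the LMM backbone and additionally augments it with ground-truth captions ("factual augmentation").

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PreferenceRewardModel(nn.Module):
    """Toy reward head: maps a fused image-text representation to a scalar reward.
    Hypothetical stand-in for the paper's LMM-based reward model."""

    def __init__(self, feature_dim: int = 512):
        super().__init__()
        self.scorer = nn.Linear(feature_dim, 1)  # scalar reward head

    def forward(self, fused_features: torch.Tensor) -> torch.Tensor:
        return self.scorer(fused_features).squeeze(-1)


def preference_loss(reward_chosen: torch.Tensor,
                    reward_rejected: torch.Tensor) -> torch.Tensor:
    """Standard pairwise (Bradley-Terry) reward-model loss used in RLHF:
    maximize the margin between the preferred and the rejected response."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()


# Usage with random stand-in features for a batch of response pairs.
model = PreferenceRewardModel()
chosen = model(torch.randn(4, 512))    # features of the less-hallucinated responses
rejected = model(torch.randn(4, 512))  # features of the more-hallucinated responses
loss = preference_loss(chosen, rejected)
loss.backward()
```

Once such a reward model is trained on human comparisons of hallucinated vs. grounded responses, it can score LMM outputs during policy optimization (e.g. PPO), which is the role it plays in the paper's pipeline.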