VHELM paper review: a holistic visual evaluation of VLMs
https://arxiv.org/abs/2410.07112

Current benchmarks for assessing vision-language models (VLMs) often focus on their perception or problem-solving capabilities and neglect other critical aspects such as fairness, multilinguality, or toxicity. Furthermore, they differ in their evaluation procedures and the scope of the evaluation, making it difficult to compare models. To address these issues, we..