Jinuk Lee's Blog
TLDR: Token-Level Detective Reward Model for Large Vision Language Models (Paper Review)
jinuklee
2024. 10. 12. 18:17
https://arxiv.org/pdf/2410.04734