Brief summary: multimodal self-training
RLAIF-V assigns a trustworthiness score to the atomic claims of each candidate response using an open-source MLLM. SIMA uses a critic prompt that considers multiple factors to obtain a preference-pair dataset. VL-Feedback uses GPT-4V to assess responses decoded from other LMMs with respect to Helpfulness, Visual Faithfulness, and Ethical Considerations. FGAIF trains a reward model to assign scores to va..
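Below is a minimal sketch of the RLAIF-V-style scoring idea described above, assuming a hypothetical `mllm_yes_no(image, claim)` callable that returns the probability an open-source MLLM judges a claim to be supported by the image. The claim splitter and function names are illustrative placeholders, not the papers' actual APIs.

```python
from typing import Callable, List, Tuple


def split_into_atomic_claims(response: str) -> List[str]:
    # Placeholder: the papers use an LLM to decompose a response into atomic
    # claims; here we simply split on sentence boundaries for illustration.
    return [s.strip() for s in response.split(".") if s.strip()]


def trustworthiness_score(image, response: str,
                          mllm_yes_no: Callable[[object, str], float]) -> float:
    # Score a response as the mean per-claim support probability from the MLLM.
    claims = split_into_atomic_claims(response)
    if not claims:
        return 0.0
    return sum(mllm_yes_no(image, c) for c in claims) / len(claims)


def build_preference_pair(image, candidates: List[str],
                          mllm_yes_no: Callable[[object, str], float]) -> Tuple[str, str]:
    # Rank candidate responses by trustworthiness and pair the highest-scoring
    # (chosen) with the lowest-scoring (rejected) for preference training.
    scored = sorted(candidates,
                    key=lambda r: trustworthiness_score(image, r, mllm_yes_no))
    return scored[-1], scored[0]
```

A critic-prompt approach (SIMA) or a GPT-4V judge (VL-Feedback) would slot into the same pipeline by replacing the per-claim scorer with a response-level rating over the listed criteria.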