Posts in the 'All categories' category (Page 16)

All categories 290

InternVideo2: Scaling Foundation Models for Multimodal Video Understanding (paper review)

https://arxiv.org/pdf/2403.15377

VLM 2024.09.30

VideoPrism: A Foundational Visual Encoder for Video Understanding

https://arxiv.org/pdf/2402.13217

VLM 2024.09.30

How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models (paper review)

https://arxiv.org/pdf/2407.00369 Large language or multimodal model based verification has been proposed to scale up online policing mechanisms for mitigating spread of false and harmful content.

Uncategorized 2024.09.26

Large Language Monkeys (paper review)

Studies inference scaling laws using unit tests, proof checkers, and majority voting as verifiers.

Uncategorized 2024.09.25
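
The contrast between verifier types in the Large Language Monkeys note above can be sketched in a few lines. This is a minimal illustration, not the paper's code: the function names, the sample answers, and the toy checker are all hypothetical. It shows that majority voting picks the most frequent sample, while an automatic verifier (standing in for unit tests or a proof checker) can recover a correct answer that appears only rarely.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def majority_vote(answers: Sequence[str]) -> Optional[str]:
    """Return the most frequent answer among the samples, or None if empty."""
    if not answers:
        return None
    # most_common(1) yields [(answer, count)]; ties go to the first one seen.
    return Counter(answers).most_common(1)[0][0]


def first_verified(
    answers: Sequence[str], verifier: Callable[[str], bool]
) -> Optional[str]:
    """Return the first sample the automatic verifier accepts, or None."""
    for answer in answers:
        if verifier(answer):
            return answer
    return None


def is_covered(answers: Sequence[str], verifier: Callable[[str], bool]) -> bool:
    """True if at least one sample passes the verifier."""
    return any(verifier(answer) for answer in answers)


# Hypothetical samples drawn repeatedly from a model for one problem.
samples = ["12", "15", "12", "9", "15", "12"]


# Hypothetical checker standing in for unit tests or a proof checker.
def toy_checker(answer: str) -> bool:
    return answer == "15"


voted = majority_vote(samples)
verified = first_verified(samples, toy_checker)
```

In this toy case the problem is covered (a correct sample exists), the checker finds it, and majority voting does not, because the correct answer is in the minority.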

RLHF Workflow (paper review)

https://arxiv.org/pdf/2405.07863

Uncategorized 2024.09.21

Qwen2-VL: Enhancing Vision-Language Model’s Perception of the World at Any Resolution

VLM 2024.09.21

When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs

https://arxiv.org/pdf/2406.01297 Self-correction is an approach to improving responses from large language models (LLMs) by refining the responses using LLMs during inference. Prior work has proposed various self-correction frameworks using different sources of feedback, including self-evaluation and external feedback. However, there is still no consensus on the question of when LLMs can correct ..

Uncategorized 2024.09.21

Training Language Models to Self-Correct via Reinforcement Learning (SCoRe), paper review

https://arxiv.org/pdf/2409.12917
Point: generate the best final answer available within the model's own distribution, and prevent model collapse. Self-correction ability is improved using entirely self-generated data.
Method: self-generated data, to avoid distribution mismatch. Two training stages, intended to avoid the model collapse seen when the minimal edit strategy fails (STaR).
LLMs' self-correction ability is inefficient (e.g. the "LLMs cannot self-correct yet" paper). Existing approaches to self-correction rely on multiple models, a more capable LLM, or ..

Uncategorized 2024.09.21

Semi-Supervised Reward Modeling via Iterative Self-Training

x = question, a1 = first answer, a2 = second answer. y = which of a1 and a2 is better (the pseudo-label). We only select those data where the model exhibits high confidence.

Uncategorized 2024.09.21
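
The pseudo-labeling step in the Semi-Supervised Reward Modeling note above might look roughly like the sketch below. It rests on assumptions not stated in the note: the preference probability is taken as a sigmoid of the reward difference (a Bradley-Terry style choice), confidence is max(p, 1 - p), and the threshold of 0.9, the function names, and the toy reward table are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str, str]              # (x, a1, a2)
LabeledPair = Tuple[str, str, str, int]  # (x, a1, a2, y); y = 0 means a1 preferred


def pseudo_label(
    pairs: Sequence[Pair],
    reward_fn: Callable[[str, str], float],
    threshold: float = 0.9,  # assumed cutoff, not taken from the paper
) -> List[LabeledPair]:
    """Keep only the pairs the current reward model labels with high confidence."""
    selected: List[LabeledPair] = []
    for x, a1, a2 in pairs:
        # Assumed preference model: P(a1 preferred) = sigmoid(r(x, a1) - r(x, a2)).
        diff = reward_fn(x, a1) - reward_fn(x, a2)
        p_a1 = 1.0 / (1.0 + math.exp(-diff))
        confidence = max(p_a1, 1.0 - p_a1)
        if confidence >= threshold:
            y = 0 if p_a1 >= 0.5 else 1
            selected.append((x, a1, a2, y))
    return selected


# Hypothetical reward scores standing in for a trained reward model.
toy_scores = {"good": 3.0, "bad": -3.0, "meh": 0.1, "ok": 0.0}


def toy_reward(x: str, a: str) -> float:
    return toy_scores[a]


unlabeled = [("q1", "good", "bad"), ("q2", "meh", "ok"), ("q3", "bad", "good")]
confident = pseudo_label(unlabeled, toy_reward, threshold=0.9)
```

Here the near-tie pair for q2 is dropped, while q1 and q3 are kept with opposite labels. In an iterative self-training loop, the kept pairs would be added to the training set before the reward model is retrained.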

SLiC-HF: Sequence Likelihood Calibration with Human Feedback (paper review)

https://arxiv.org/abs/2305.10425

Uncategorized 2024.09.21


이진욱's blog

ai research memo for reference
