Posts in the 'inference-time, RLHF/scalable oversight' category

On scalable oversight with weak LLMs judging strong LLMs: paper review

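https://arxiv.org/pdf/2407.04622

Starting point
The paper starts from the idea (AI safety via debate, arXiv) that a debate between two AIs can lead a judge model to choose the correct answer.
The hope is that, as at a Nash equilibrium of the debate, both AIs will tell the truth to the judge AI in the most convincing way they can.

1. Introduction
Task types
1. Extractive: a question, two answer options for it, and the original source article. However, the judge model can't see the article -> information asymmetry.
2. Closed: only a question and two answer options for it.
3. Multimodal: image ..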
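
The information asymmetry in the extractive setting can be pictured with a small sketch. This is only an illustration of the description above, not the paper's code: `DebateTask`, `debater_view`, and `judge_view` are hypothetical names, and the sketch assumes the debaters are the ones who get to read the article while the judge sees only the question and the two answer options.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DebateTask:
    """One question with two candidate answers (hypothetical structure)."""
    kind: str                      # "extractive" or "closed"
    question: str
    answers: Tuple[str, str]
    article: Optional[str] = None  # source text; only extractive tasks carry one


def debater_view(task: DebateTask) -> dict:
    """Assumption: debaters see everything, including the article if present."""
    return {
        "question": task.question,
        "answers": task.answers,
        "article": task.article,
    }


def judge_view(task: DebateTask) -> dict:
    """The judge never receives the article: this is the information asymmetry."""
    return {"question": task.question, "answers": task.answers}
```

For a closed task the `article` field is simply `None`, so both views carry the same information and the asymmetry disappears.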

inference-time, RLHF/scalable oversight 2024.07.22


이진욱's blog

