2024/10 post list

VARIATIONAL BEST-OF-N ALIGNMENT paper review

https://arxiv.org/pdf/2407.06057

Uncategorized 2024.10.29

InferAligner paper review: Inference-Time Alignment for Harmlessness through Cross-Model Guidance, 2024

https://arxiv.org/abs/2401.11206 With the rapid development of large language models (LLMs), they are not only used as general-purpose AI assistants but are also customized through further fine-tuning to meet the requirements of different applications. A pivotal factor in the success of c..

Uncategorized 2024.10.29

Fast Best-of-N Decoding via Speculative Rejection

https://arxiv.org/pdf/2410.20290 The safe and effective deployment of Large Language Models (LLMs) involves a critical step called alignment, which ensures that the model’s responses are in accordance with human preferences. Prevalent alignment techniques, such as DPO, PPO and their variants, align LLMs by changing the pre-trained model weights during a phase called post-training. While predomina..

Uncategorized 2024.10.29

ArmoRM paper review: Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts

https://arxiv.org/pdf/2406.12845 https://github.com/RLHFlow/RLHF-Reward-Modeling Reinforcement learning from human feedback (RLHF) has emerged as the primary method for aligning large language models (LLMs) with human preferences. The RLHF process typically starts by training a reward model (RM) using human preference data. Conventional RMs are trained on pairwise responses to the same user reque..

Uncategorized 2024.10.29

MAVIS: Mathematical Visual Instruction Tuning paper review

https://arxiv.org/pdf/2407.08739 Multi-modal Large Language Models (MLLMs) have recently emerged as a significant focus in academia and industry. Despite their proficiency in general multi-modal scenarios, the mathematical problem-solving capabilities in visual contexts remain insufficiently explored. We identify three key areas within MLLMs that need to be improved: visual encoding of math diag..

multi-step reasoning (math, coding, planning)/multimodal CoT 2024.10.25

IMPROVE VISION LANGUAGE MODEL CHAIN-OF-THOUGHT REASONING paper review

https://arxiv.org/pdf/2410.16198 https://github.com/RifleZhang/LLaVA-Reasoner-DPO Chain-of-thought (CoT) reasoning in vision language models (VLMs) is crucial for improving interpretability and trustworthiness. However, current training recipes lack robust CoT r..

multi-step reasoning (math, coding, planning)/multimodal CoT 2024.10.25

Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges

https://arxiv.org/abs/2406.12624

Uncategorized 2024.10.24

Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking

https://arxiv.org/abs/2409.15268

Uncategorized 2024.10.24

llm-as-a-judge related work

2.3 RLAIF AND LLM-AS-A-JUDGE Reinforcement Learning from AI Feedback (RLAIF) presents an alternative approach to the standard RLHF pipeline. Bai et al. (2022b) demonstrate the efficacy of RLAIF in training helpful and harmless models without relying on human feedback labels for harmlessness assessment. Their work shows that as language model capabilities improve, AI identification of harms incre..

Uncategorized 2024.10.23

UnStar: Unlearning with Self-Taught Anti-Sample Reasoning for LLMs paper review

https://arxiv.org/abs/2410.17050 The key components of machine learning are data samples for training, model for learning patterns, and loss function for optimizing accuracy. Analogously, unlearning can potentially be achieved through anti-data samples (or anti-samples), unlearning method..

Uncategorized 2024.10.23

Lee Jin-wook (이진욱)'s blog

2024/10 60
