All Posts (287)

A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods (Paper Review)

https://arxiv.org/html/2502.01618v3#S3 A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods. We now zoom in on how PF scales with inference-time compute. Figure 2 shows the change in performance (in terms of accuracy) with an increasing computation budget (N = 1, 2, 4, 8, 16, 32, 64, 128) ..
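The excerpt describes particle filtering (PF) over partial generations: propose next reasoning steps, weight the partial trajectories with a process reward model, and resample N particles. A minimal sketch of that loop, where `llm_step` and `prm_score` are hypothetical placeholders for a step generator and a process reward model, not the paper's actual interfaces:

```python
import random

def particle_filter_decode(llm_step, prm_score, prompt, n_particles=8, max_steps=10):
    # One particle = one partial reasoning trajectory (a string prefix).
    particles = [prompt] * n_particles
    for _ in range(max_steps):
        # Propagate: each particle samples its next reasoning step.
        particles = [p + llm_step(p) for p in particles]
        # Weight: score each partial trajectory with the process reward model.
        weights = [prm_score(p) for p in particles]
        total = sum(weights) or 1.0
        # Resample: trajectories survive in proportion to their weight.
        particles = random.choices(
            particles, weights=[w / total for w in weights], k=n_particles
        )
    # Return the trajectory the reward model scores highest.
    return max(particles, key=prm_score)
```

Scaling the computation budget then just means increasing `n_particles`, which matches the N sweep in the excerpt.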

Uncategorized 2025.05.20

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model (Summary)

3 IXC2.5-Reward: Data Preparation. Reward models are trained using pairwise preference annotations (e.g., prompts x with chosen responses y_c and rejected responses y_r) that reflect human preferences. While existing public preference data is primarily textual, with limited image and scarce video examples, we train IXC-2.5-Reward using both open-source data and a newly collected dataset to ensure broad..
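For context, reward models trained on (x, y_c, y_r) pairs typically use a Bradley-Terry objective that pushes r(x, y_c) above r(x, y_r). A minimal PyTorch sketch of that standard loss (the paper's exact objective may differ; `reward_model` below is a hypothetical scorer):

```python
import torch.nn.functional as F

def pairwise_preference_loss(reward_chosen, reward_rejected):
    # -log sigmoid(r(x, y_c) - r(x, y_r)), averaged over the batch.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Usage, assuming reward_model returns one scalar score per (prompt, response):
# loss = pairwise_preference_loss(reward_model(x, y_c), reward_model(x, y_r))
```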

Uncategorized 2025.02.12

Virgo: A Preliminary Exploration on Reproducing o1-like MLLM (Summary)

2 Method. In this section, we present our preliminary attempts to adapt MLLMs by equipping them with slow-thinking capacities for complex multimodal tasks. We explore two straightforward adaptation methods: (1) transferring slow-thinking abilities using text-based long thought data, and (2) distilling multimodal long thought data from existing slow-thinking MLLMs. Our aim is to investigate how slo..
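Adaptation method (1) amounts to supervised fine-tuning on long-thought traces. A sketch of how one such training example might be packed into a chat-format record; the `<thought>` tag and message schema are assumptions for illustration, not Virgo's published format:

```python
def format_long_thought_example(question, thought, answer):
    # Pack a distilled long chain of thought and the final answer into a
    # single assistant turn, in a "think, then answer" style.
    target = f"<thought>\n{thought}\n</thought>\n\n{answer}"
    return {"messages": [
        {"role": "user", "content": question},
        {"role": "assistant", "content": target},
    ]}
```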

Uncategorized 2025.02.11

Insight-V Paper Summary

https://arxiv.org/html/2411.14432v1#S3 In short, it uses two MLLMs: one for reasoning and one for summarization. To fully leverage the reasoning capabilities of MLLMs, we propose Insight-V, a novel system comprising two MLLMs dedicated to reasoning and summarization, respectively. The reasoning model generates a detailed reasoning process; the summary model uses that reasoning as supplementary information and assesses its relevance and utility with respect to the final answer. 3.2 Constru..
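The two-model decomposition could be wired up roughly as below; the `.generate` interface is a hypothetical placeholder, not Insight-V's actual API:

```python
def insight_v_style_answer(reasoning_model, summary_model, image, question):
    # Stage 1: the reasoning MLLM drafts a detailed reasoning process.
    reasoning = reasoning_model.generate(image=image, prompt=question)
    # Stage 2: the summary MLLM treats that reasoning as supplementary
    # information, judging its relevance and utility before answering.
    summary_prompt = (
        f"Question: {question}\n"
        f"Supplementary reasoning: {reasoning}\n"
        "Assess whether the reasoning is relevant and useful, "
        "then give the final answer."
    )
    return summary_model.generate(image=image, prompt=summary_prompt)
```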

Uncategorized 2025.02.10

Forest-of-Thought Paper Summary

https://arxiv.org/html/2412.09078v1#S4 Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning. Zhenni Bi, Kai Han, Chuanjian Liu, Yehui Tang, Yunhe Wang. Abstract: Large Language Models (LLMs) have shown remarkable abilities across various language tasks. Benchmarks: GSM8K, MATH. 3.1 FoT frame..
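At a high level, FoT runs several independent reasoning trees and aggregates their outputs. A minimal sketch assuming a hypothetical `solve_tree` helper that runs one tree search and returns an answer; plain majority voting here stands in for the paper's full decision strategy:

```python
from collections import Counter

def forest_of_thought(solve_tree, question, n_trees=4):
    # Grow n_trees independent reasoning trees for the same question.
    answers = [solve_tree(question) for _ in range(n_trees)]
    # Aggregate by majority vote over the trees' final answers.
    return Counter(answers).most_common(1)[0][0]
```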