Papers related to "from System 2"

jinuklee 2025. 1. 19. 22:43

Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models

https://openreview.net/pdf?id=kIP0duasBb

Efficient Test-Time Prompt Tuning for Vision-Language Models

https://arxiv.org/html/2408.05775v1

Self-Generated Critiques Boost Reward Modeling for Language Models

https://arxiv.org/html/2411.16646v2

Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning

https://arxiv.org/abs/2410.08146

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling

https://arxiv.org/html/2410.16033v3

Fast Best-of-N Decoding via Speculative Rejection

https://openreview.net/forum?id=348hfcprUs

Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation

https://arxiv.org/html/2410.02725v1

Test-time Computing: from System-1 Thinking to System-2 Thinking

https://arxiv.org/html/2501.02497v1