Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models
https://openreview.net/pdf?id=kIP0duasBb
Efficient Test-Time Prompt Tuning for Vision-Language Models
https://arxiv.org/html/2408.05775v1
Self-Generated Critiques Boost Reward Modeling for Language Models
https://arxiv.org/html/2411.16646v2
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning
https://arxiv.org/abs/2410.08146
TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
https://arxiv.org/html/2410.16033v3
Fast Best-of-N Decoding via Speculative Rejection
https://openreview.net/forum?id=348hfcprUs
Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation
https://arxiv.org/html/2410.02725v1
Test-time Computing: from System-1 Thinking to System-2 Thinking