Chunyu Wang

Researcher, Microsoft Research Asia

1 paper at NeurIPS 2025

Homepage· OpenReview· Semantic Scholar· Google Scholar

Poster Session 6

1 paper

Friday, December 5, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C,D,E

Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning

#4604 · Yibin Wang, Zhimin Li, Yuhang Zang, Chunyu Wang, Qinglin Lu, Cheng Jin, Jiaqi Wang

This paper the first unified multimodal CoT-based reward model, capable of multi-dimensional, step-by-step long-chain reasoning for both visual understanding and generation reward tasks.