Researcher, Stepfun
1 paper at NeurIPS 2025
Explores rule-based reinforcement learning (RL) in multimodal large language model (MLLM) post-training for perception policy learning.