Intern, Shanghai AI Laboratory
1 paper at NeurIPS 2025
We systematically examine the current state of reinforcement learning with verifiable rewards (RLVR) and find, surprisingly, that it does not elicit fundamentally new reasoning patterns, revealing a gap between the potential of RL and the actual impact of current RLVR methods.