Full Professor, Tsinghua University
2 papers at NeurIPS 2025
We systematically examine the current state of RLVR and find, surprisingly, that it does not elicit fundamentally new reasoning patterns, revealing a gap between the potential of RL and the actual impact of current RLVR methods.