PhD student, Peking University
3 papers at NeurIPS 2025
We propose a co-evolving reinforcement learning method that jointly optimizes a coder and a unit tester without relying on ground-truth code as supervision.