PhD student, University of Chicago
1 paper at NeurIPS 2025
We propose a co-evolving reinforcement learning method that jointly optimizes the coder and unit tester without relying on ground-truth code supervision.
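As a rough illustration of how a coder and a unit tester might score each other without a reference solution, the sketch below derives both rewards from a pass matrix of generated programs against generated tests. This is a simplified assumption for exposition, not the paper's actual reward: the function name `mutual_rewards`, the `passes` matrix, and the "top-scoring programs are likely correct" pseudo-labeling rule are all hypothetical.

```python
from typing import List, Tuple


def mutual_rewards(passes: List[List[bool]]) -> Tuple[List[float], List[float]]:
    """Hypothetical mutual scoring of candidate programs and candidate tests.

    passes[i][j] is True when candidate program i passes candidate test j.
    Returns (code_rewards, test_rewards), each value in [0, 1].
    """
    n_codes = len(passes)
    n_tests = len(passes[0]) if passes else 0

    # Coder reward: fraction of the generated tests that each program passes.
    code_rewards = [sum(row) / n_tests if n_tests else 0.0 for row in passes]

    # Pseudo-label (assumption): programs tied for the best score are
    # treated as "likely correct"; no ground-truth code is consulted.
    best = max(code_rewards, default=0.0)
    likely_correct = [r == best for r in code_rewards]

    # Tester reward: fraction of programs on which the test's verdict
    # (pass/fail) agrees with the pseudo-label.
    test_rewards = []
    for j in range(n_tests):
        agree = sum(passes[i][j] == likely_correct[i] for i in range(n_codes))
        test_rewards.append(agree / n_codes)

    return code_rewards, test_rewards
```

In a training loop, scores like these would feed a policy-gradient update for each role. A test that every program passes, or that every program fails, separates nothing and so earns a lower tester reward than one that splits the candidates along the pseudo-labels.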