Full Professor, Massachusetts Institute of Technology
2 papers at NeurIPS 2025
Unifies supervised and reinforcement fine-tuning and outperforms both, with theoretical justification.