PhD student, Massachusetts Institute of Technology
1 paper at NeurIPS 2025
Through theoretical models and empirical testbeds, we characterize the algorithmic tradeoff between privileged expert distillation and reinforcement learning (RL), as well as better options for expert distillation.