Assistant Professor, Peking University
1 paper at NeurIPS 2025
We introduce a Functional Scaling Law that predicts the full loss dynamics of SGD under arbitrary learning-rate schedules.