Researcher, Google
3 papers at NeurIPS 2025
We analyze the parameter bounds for robust memorization as a function of the robustness ratio.
We theoretically investigate weak-to-strong generalization from a linear CNN to a two-layer ReLU CNN.
We show that Schedule-Free methods effectively navigate the river structure of the loss landscape, enabling scalable language model training without decay schedules or extra memory.