Full Professor, Stanford University
1 paper at NeurIPS 2025
Large neural networks first learn low-dimensional feature representations, then overfit the data and revert to a kernel regime.