Research Group Leader, Institute for AI in Medicine (IKIM)
1 paper at NeurIPS 2025
We use grokking to disentangle generalization from training dynamics and show that relative flatness, not neural collapse, is a necessary and more predictive indicator of generalization in deep networks.