Full Professor, Institute of Science and Technology Austria
2 papers at NeurIPS 2025
We prove that neural collapse is approximately optimal in deep, regularized ResNets and transformers trained end-to-end.
We characterize the structure of embeddings obtained via gradient descent, proving that the attention mechanism selects important tokens.