Researcher, University of Adelaide/Australian Institute of Machine Learning
2 papers at NeurIPS 2025
We propose spectral conditioning of attention layers to improve Jacobian conditioning, leading to more stable and efficient optimization with negligible computational overhead and consistent gains across diverse transformer architectures.