Undergrad student, Fudan University
1 paper at NeurIPS 2025
We establish the universal approximation capability of single-layer, single-head self- and cross-attention mechanisms.
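For readers unfamiliar with the object being studied, here is a minimal sketch of one single-layer, single-head attention map. It assumes the standard scaled dot-product form; the paper's exact parameterization is not given here, so the helper names (`matmul`, `softmax`, `single_head_attention`) and the weight matrices `W_q`, `W_k`, `W_v` are illustrative assumptions rather than the authors' construction. Self-attention passes the same sequence as both arguments; cross-attention passes two different sequences.

```python
import math

def matmul(A, B):
    # Plain-Python matrix product of two lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def single_head_attention(query_tokens, context_tokens, W_q, W_k, W_v):
    # Assumed form: softmax(Q K^T / sqrt(d_head)) V, with one head and one layer.
    Q = matmul(query_tokens, W_q)
    K = matmul(context_tokens, W_k)
    V = matmul(context_tokens, W_v)
    scale = math.sqrt(len(K[0]))
    K_T = [list(col) for col in zip(*K)]
    scores = matmul(Q, K_T)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, V)

# Hypothetical toy inputs: 2 query tokens, 3 context tokens, model width 2.
identity = [[1.0, 0.0], [0.0, 1.0]]
X = [[1.0, 0.0], [0.0, 1.0]]
Y = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

self_out = single_head_attention(X, X, identity, identity, identity)   # self-attention
cross_out = single_head_attention(X, Y, identity, identity, identity)  # cross-attention
```

In both calls the output has one row per query token, and each row is a convex combination of the value vectors.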