Full Professor, Sapienza University of Rome
1 paper at NeurIPS 2025
Attention sinks in LLMs serve as geometric reference frames that anchor token representations in high-dimensional space; they emerge during training as optimal solutions to the coordinate-system problem and are shaped by the architecture and position encodings.
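A minimal NumPy sketch of the sink phenomenon the summary describes: when one key (here position 0) carries a score bias, softmax attention concentrates mass on it regardless of the query, so it behaves like a fixed reference point. The bias value, sequence length, and head dimension below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
T, d = 8, 16  # toy sequence length and head dimension (assumed values)

# Near-orthogonal random queries/keys, plus a hand-set positive bias on
# the position-0 key score to mimic a learned sink (illustrative only).
Q = rng.normal(size=(T, d)) / np.sqrt(d)
K = rng.normal(size=(T, d)) / np.sqrt(d)
scores = Q @ K.T
scores[:, 0] += 2.0  # sink bias

# Causal mask: token t attends only to positions <= t
mask = np.triu(np.ones((T, T), dtype=bool), k=1)
scores[mask] = -np.inf
A = softmax(scores, axis=-1)

# Average attention mass captured by the sink token across all queries
sink_mass = A[:, 0].mean()
print(f"mean attention on sink token: {sink_mass:.2f}")
```

Because the remaining scores are near zero, the sink absorbs most of each row's probability mass, giving every token a common anchor direction in representation space.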