Postdoc, National University of Singapore
1 paper at NeurIPS 2025
We analyze how tokens flow across attention layers and use these insights to improve the performance of Transformers.