Principal Researcher, Saarland Informatics Campus, Max Planck Institute for Informatics
2 papers at NeurIPS 2025
We interpret attention as a discrete-time Markov chain and show the effectiveness of this view on various downstream tasks.