Associate Professor, Kyoto University
2 papers at NeurIPS 2025
We rigorously identify the infinite-width limit distribution of neurons within a single attention layer under realistic architectural dimensionality.
We prove that transformers can achieve nearly optimal dynamic regret bounds in non-stationary environments.