Full Professor, School of Computer and Communication Sciences, EPFL - EPF Lausanne
3 papers at NeurIPS 2025
The paper proves that a two-layer, single-head transformer can reliably perform in-context learning on Markov chains of any order.
We present conditions that preclude the existence of tight generalization bounds, and contrast them with a stability condition that guarantees their existence.