PhD student, Weizmann Institute of Science
2 papers at NeurIPS 2025
We prove that, under appropriate conditions, a single-head softmax attention mechanism exhibits benign overfitting.
We prove that, under appropriate conditions, linear attention is an almost optimal metalearner for linear classification.