We prove that, under appropriate conditions, linear attention is an almost optimal metalearner for linear classification.
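To make the setting concrete, here is a minimal sketch of how a single linear attention layer can act as an in-context linear classifier. It is an illustration under stated assumptions, not the construction analyzed in the paper. The prediction rule sign(x_q^T W (1/n) Σ y_i x_i), the identity choice for W, and the Gaussian-mixture task generator (`sample_task`, with its `signal` parameter) are all assumptions introduced here.

```python
import random


def linear_attention_predict(context, query, W=None):
    """Predict a +/-1 label for `query` from labeled context pairs (x_i, y_i).

    Computes sign(query^T W (1/n) sum_i y_i x_i). W stands in for a learned
    matrix; it defaults to the identity (an assumption for this sketch).
    """
    d = len(query)
    n = len(context)
    # Label-weighted mean of the context features: (1/n) * sum_i y_i * x_i
    mean = [sum(y * x[j] for x, y in context) / n for j in range(d)]
    if W is not None:
        # Apply W to the label-weighted mean.
        mean = [sum(W[r][c] * mean[c] for c in range(d)) for r in range(d)]
    score = sum(q * m for q, m in zip(query, mean))
    return 1 if score >= 0 else -1


def sample_task(rng, d, n, signal=2.0):
    """Hypothetical task: x = y * signal * mu + Gaussian noise, mu a random unit vector."""
    mu = [rng.gauss(0, 1) for _ in range(d)]
    norm = sum(m * m for m in mu) ** 0.5
    mu = [m / norm for m in mu]

    def draw():
        y = rng.choice([-1, 1])
        x = [y * signal * m + rng.gauss(0, 1) for m in mu]
        return x, y

    # n labeled context examples plus one held-out query example
    return [draw() for _ in range(n)], draw()


# Each trial draws a fresh task, so the classifier must adapt from context alone.
rng = random.Random(0)
trials = 200
correct = 0
for _ in range(trials):
    context, (x_query, y_query) = sample_task(rng, d=5, n=40)
    correct += linear_attention_predict(context, x_query) == y_query
accuracy = correct / trials
```

With W fixed to the identity, this reduces to a simple averaging classifier. In a trained model W would be learned across tasks, and that is where the "under appropriate conditions" qualifier would come into play.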