We present Neural Attention Search (NAtS), an end-to-end learnable sparse transformer.