Associate Professor, University of Oxford
2 papers at NeurIPS 2025
We provide the first proof that pause tokens (such as "...") appended to the input of a Transformer can strictly increase its expressivity.