Diyuan Wu

PhD student, Institute of Science and Technology Austria

1 paper at NeurIPS 2025

Homepage · OpenReview · Semantic Scholar · Google Scholar

Poster Session 6

Friday, December 5, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C,D,E

Attention with Trained Embeddings Provably Selects Important Tokens

#3909 · Diyuan Wu, Aleksandr Shevchenko, Samet Oymak, Marco Mondelli

We characterize the structure of embeddings obtained via gradient descent, showing that the attention mechanism provably selects important tokens.