VP, Foundations of AI, International Business Machines
1 paper at NeurIPS 2025
We demonstrate scenarios in which transformer models with sparse attention learn and generalize faster, and theoretically characterize the conditions under which this occurs.