PhD student, Massachusetts Institute of Technology
2 papers at NeurIPS 2025
We combine the two types of memory systems used in quadratic and linear transformers into a single hybrid memory system, leveraging their complementary strengths in context coverage, precise retrieval, and expressivity.
We develop algorithms that are guaranteed to PAC learn transformers.