Full Professor, University of California, Berkeley
3 papers at NeurIPS 2025
We use random matrix theory to estimate the spectral density of matrices too large to fit into memory.
We characterize the key factors behind good synthetic priors and build a state-of-the-art tabular foundation model by pretraining on these priors.
We accelerate attention for long-context reasoning by identifying and loading important tokens and approximating attention to the less important ones.