Undergraduate researcher, Massachusetts Institute of Technology
1 paper at NeurIPS 2025
Bipartite mutual information in natural text exhibits sub-volume growth. From this, we prove a lower bound on how a model's history state must scale with context length, which sets a necessary condition for architectures to be effective at long-context language modeling.