Professor, Princeton University
3 papers at NeurIPS 2025
Recurrent neural networks spontaneously model partners during collaboration — without specialised architectures — but only when partner-specific adaptation improves task performance.
The paper proposes Causal Head Gating, a scalable, unsupervised method that classifies transformer attention heads by their causal impact on task performance and reveals task-specific sub-circuits.