Postdoc, McGill University
1 paper at NeurIPS 2025
We present a method that enables length generalization in state-space models by modulating the $A$ matrices on a per-layer basis.