PhD student, University of Montreal
1 paper at NeurIPS 2025
We propose a method that enables length generalization in state-space models by modulating each layer's state-transition matrix $A$.