4 papers across 2 sessions
We introduce TiledFlashLinearAttention, a faster kernel algorithm for linear RNNs and mLSTMs via improved sequence parallelism.
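The kernel structure this exploits can be illustrated with a toy chunkwise linear-attention computation: the recurrence is sequential only across chunks, while work inside each chunk becomes matrix multiplications that tile well on GPUs. This is a generic NumPy sketch of chunkwise linear attention, not the TiledFlashLinearAttention implementation; all function names are illustrative.

```python
import numpy as np

def linear_attn_recurrent(Q, K, V):
    # Reference recurrent form: S_t = S_{t-1} + k_t v_t^T, o_t = S_t^T q_t
    T, d = Q.shape
    S = np.zeros((d, V.shape[1]))
    out = np.zeros_like(V)
    for t in range(T):
        S = S + np.outer(K[t], V[t])   # rank-1 state update
        out[t] = S.T @ Q[t]            # readout from the running state
    return out

def linear_attn_chunked(Q, K, V, C=4):
    # Chunkwise-parallel form: inter-chunk contribution comes from the
    # carried state S; intra-chunk contribution is a causally masked matmul.
    T, d = Q.shape
    S = np.zeros((d, V.shape[1]))
    out = np.zeros_like(V)
    for s in range(0, T, C):
        q, k, v = Q[s:s+C], K[s:s+C], V[s:s+C]
        mask = np.tril(np.ones((len(q), len(q))))  # causal mask within chunk
        out[s:s+C] = q @ S + ((q @ k.T) * mask) @ v
        S = S + k.T @ v                            # carry state to next chunk
    return out

rng = np.random.default_rng(2)
Q, K, V = rng.standard_normal((3, 16, 8))
print(np.allclose(linear_attn_recurrent(Q, K, V), linear_attn_chunked(Q, K, V)))
```

Both forms compute the same outputs; the chunked one replaces most of the sequential loop with batched matmuls, which is the starting point for sequence-parallel kernels.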
We extend DeltaNet by using products of Householder matrices as state-transition matrices, allowing us to trade off expressivity against computational cost.
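A minimal numerical sketch of the core object, assuming generalized Householder factors of the form I - beta*v*v^T (this is an illustration of the construction, not the paper's code): each factor has spectral norm at most 1 when beta is in [0, 2], so a product of k of them is a non-expanding transition matrix, and k controls the expressivity/compute trade-off.

```python
import numpy as np

def generalized_householder(v, beta):
    # I - beta * v v^T with unit v; eigenvalues are 1 and 1 - beta,
    # so beta in [0, 2] keeps the spectral norm at most 1.
    v = v / np.linalg.norm(v)
    return np.eye(len(v)) - beta * np.outer(v, v)

def transition(vs, betas):
    # State-transition matrix as a product of k generalized Householder
    # matrices: larger k -> more expressive transitions, more compute.
    A = np.eye(len(vs[0]))
    for v, b in zip(vs, betas):
        A = generalized_householder(v, b) @ A
    return A

rng = np.random.default_rng(0)
d, k = 4, 2
A = transition([rng.standard_normal(d) for _ in range(k)], [1.5, 0.7])
# Product of non-expanding factors is non-expanding: ||A||_2 <= 1
print(np.linalg.norm(A, 2) <= 1 + 1e-9)
```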
We introduce the Fixed-Point RNN framework, which solves state-tracking tasks by parameterizing the state-transition matrix as implicitly dense.
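A toy sketch of the general principle (not the paper's parameterization): iterating a cheap structured linear map h <- A h + b to its fixed point yields h* = (I - A)^{-1} b, so the effective transition applied to the input is dense even though each iteration only touches a diagonal-plus-low-rank matrix. The specific structure and scaling here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
# Cheap structured map: diagonal plus rank-1, rescaled to be a contraction
A = np.diag(rng.uniform(-0.5, 0.5, d)) + np.outer(*rng.standard_normal((2, d)))
A = 0.9 * A / np.linalg.norm(A, 2)
b = rng.standard_normal(d)

# Fixed-point iteration h <- A h + b converges (contraction) to
# h* = (I - A)^{-1} b: an implicitly dense effective transition.
h = np.zeros(d)
for _ in range(300):
    h = A @ h + b

print(np.allclose(h, np.linalg.solve(np.eye(d) - A, b)))
```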
We extend linear RNNs to multi-dimensional structures while remaining stable and parallelizable.
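To make "multi-dimensional" concrete, here is a deliberately simple 2D linear recurrence over a grid, with scalar gates chosen so that |a| + |b| <= 1 keeps the response bounded. This is a generic illustration of a multi-dimensional linear recurrence and its stability condition, not the paper's model; the naive double loop stands in for the parallel scan a real implementation would use.

```python
import numpy as np

def linear_rnn_2d(X, a=0.4, b=0.5):
    # Toy 2D linear recurrence on an image-like grid:
    #   h[i, j] = a*h[i-1, j] + b*h[i, j-1] + x[i, j]
    # With |a| + |b| <= 1 the path-sum stays bounded (stability).
    H = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            up = H[i - 1, j] if i > 0 else 0.0
            left = H[i, j - 1] if j > 0 else 0.0
            H[i, j] = a * up + b * left + X[i, j]
    return H

X = np.ones((8, 8))
H = linear_rnd = linear_rnn_2d(X)
# Geometric-series bound: every entry is below 1 / (1 - (a + b)) = 10
print(H.max() < 10.0)
```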