1 paper across 1 session
We introduce TiledFlashLinearAttention a faster kernel algorithm for Linear RNNs and mLSTMs by improved Sequence Parallelism.