2 papers across 1 session
We introduce TiledFlashLinearAttention, a faster kernel algorithm for Linear RNNs and mLSTMs based on improved sequence parallelism.
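To give a feel for the setting, here is a minimal NumPy sketch of the generic chunkwise (chunk-parallel) form of causal linear attention that kernels in this family build on. It is not the TiledFlashLinearAttention kernel itself: the function name and chunk size are illustrative, and normalization, gating, and all GPU tiling details are omitted.

```python
import numpy as np

def chunkwise_linear_attention(Q, K, V, chunk_size=64):
    """Causal linear attention computed chunk by chunk (illustrative sketch).

    Shapes: Q, K, V are (T, d). Omits normalization and gating.
    """
    T, d = Q.shape
    O = np.zeros_like(V)
    S = np.zeros((d, d))  # running K^T V state carried across chunks
    for start in range(0, T, chunk_size):
        end = min(start + chunk_size, T)
        q, k, v = Q[start:end], K[start:end], V[start:end]
        # inter-chunk part: contribution of all earlier chunks via the state
        O[start:end] = q @ S
        # intra-chunk part: causally masked attention within the chunk
        mask = np.tril(np.ones((end - start, end - start)))
        O[start:end] += ((q @ k.T) * mask) @ v
        # fold this chunk into the state for later chunks
        S += k.T @ v
    return O
```

Because the state update is associative, the chunks expose parallelism along the sequence dimension, which is what kernel-level tiling schemes exploit.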
This paper presents FlashBias, which speeds up the computation of attention with bias, yielding a 1.5x speedup for AlphaFold and a 2x speedup for SwinV2.
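For reference, the sketch below shows the operation being accelerated: standard attention with an additive bias term (e.g., the pair-representation bias in AlphaFold or the relative-position bias in SwinV2). This is the naive baseline, not the FlashBias kernel.

```python
import numpy as np

def attention_with_bias(Q, K, V, B):
    """Naive reference for attention with an additive bias.

    Shapes: Q, K, V are (T, d); B is a (T, T) bias added to the logits.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d) + B               # bias added before softmax
    scores -= scores.max(axis=-1, keepdims=True)    # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```

The (T, T) bias matrix is what makes this memory-bound for fused kernels such as FlashAttention, which is the bottleneck FlashBias targets.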