7 papers across 3 sessions
We introduce an alias‑free ViT that combines anti‑aliasing with linear cross‑covariance attention to achieve fractional shift invariance, delivering ~99% prediction consistency under sub‑pixel shifts and stronger translation robustness at competitive accuracy.
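As an illustration of the linear-cost attention ingredient, here is a minimal numpy sketch of cross-covariance attention in a common XCiT-style formulation: attention is taken over the d × d channel cross-covariance rather than the N × N token matrix, so cost grows linearly with the number of tokens. The normalization and temperature details are assumptions, and the anti-aliasing machinery that provides the shift invariance is not shown.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_covariance_attention(Q, K, V, tau=1.0):
    """Q, K, V: (N, d) token features for one head.

    Illustrative XCiT-style formulation (an assumption, not necessarily
    this paper's exact design): attend over the d x d cross-covariance.
    """
    Qh = Q / (np.linalg.norm(Q, axis=0, keepdims=True) + 1e-6)  # unit-norm columns
    Kh = K / (np.linalg.norm(K, axis=0, keepdims=True) + 1e-6)
    A = softmax((Kh.T @ Qh) / tau, axis=0)                      # d x d attention map
    return V @ A                                                # (N, d) output

rng = np.random.default_rng(0)
N, d = 197, 64                       # e.g. ViT tokens, head dimension
Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))
print(cross_covariance_attention(Q, K, V).shape)   # (197, 64)
```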
We propose LION, a framework for extending Linear Transformers to the bidirectional setting by providing three theoretically equivalent representations: full attention, bidirectional RNN, and chunkwise parallel form.
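To make the equivalence concrete, the following numpy sketch (ignoring LION's decay/selectivity terms, so it is an illustration rather than the paper's exact parameterization) shows that bidirectional linear attention computed in full-attention form matches a forward plus a backward recurrence over the running key-value state, up to a doubly counted diagonal term.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 16, 8                          # sequence length, head dimension
Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))

# 1) "Full attention" form of bidirectional linear attention: quadratic in N.
Y_attn = (Q @ K.T) @ V

# 2) Bidirectional RNN form: a forward and a backward scan, linear in N.
Y_rnn = np.zeros((N, d))
S = np.zeros((d, d))
for t in range(N):                    # forward scan over k v^T outer products
    S += np.outer(K[t], V[t])
    Y_rnn[t] += Q[t] @ S
S = np.zeros((d, d))
for t in reversed(range(N)):          # backward scan
    S += np.outer(K[t], V[t])
    Y_rnn[t] += Q[t] @ S
Y_rnn -= (Q * K).sum(-1, keepdims=True) * V   # remove double-counted diagonal

print(np.allclose(Y_attn, Y_rnn))     # True
```

The chunkwise parallel form interpolates between these two extremes and is omitted here for brevity.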
We present a state-of-the-art sequence-parallelism method for linear attention, built on a new collective-communication primitive.
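As a rough illustration of why sequence parallelism suits linear attention (the paper's collective-communication primitive is not reproduced here), this numpy sketch splits a causal linear-attention computation across simulated ranks: each rank only needs the d × d prefix state accumulated by earlier ranks, so the communicated payload is independent of sequence length.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, P = 32, 8, 4                    # tokens, head dimension, number of "ranks"
Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))
chunk = N // P

# Each rank computes the kv-state of its local chunk independently.
local_states = []
for r in range(P):
    s = slice(r * chunk, (r + 1) * chunk)
    local_states.append(K[s].T @ V[s])          # d x d

# "Communication" step: every rank receives the exclusive prefix sum of the
# earlier ranks' states (a collective op in practice, a plain loop here).
prefix = np.zeros((d, d))
Y = np.zeros((N, d))
for r in range(P):
    s = slice(r * chunk, (r + 1) * chunk)
    Qc, Kc, Vc = Q[s], K[s], V[s]
    causal = np.tril(Qc @ Kc.T)                 # intra-chunk causal part
    Y[s] = Qc @ prefix + causal @ Vc            # inter-chunk + intra-chunk
    prefix += local_states[r]

# Reference: full causal linear attention, quadratic in N.
Y_ref = np.tril(Q @ K.T) @ V
print(np.allclose(Y, Y_ref))                    # True
```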
We introduce TiledFlashLinearAttention, a faster kernel algorithm for linear RNNs and mLSTMs based on improved sequence parallelism.
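Below is a minimal numpy sketch of the chunkwise-parallel form such kernels build on, using a gated linear recurrence with a scalar forget gate per step; mLSTM's exponential gating, numerical stabilization, and the actual tiling over head dimensions are omitted, so this is an assumption-laden illustration rather than TiledFlashLinearAttention itself.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, C = 32, 8, 8                       # tokens, head dimension, chunk size
Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))
a = rng.uniform(0.8, 1.0, size=N)        # scalar forget gate per step (assumed form)

# Reference: recurrent evaluation, one token at a time.
# S_t = a_t * S_{t-1} + k_t v_t^T,   y_t = q_t S_t
Y_ref, S = np.zeros((N, d)), np.zeros((d, d))
for t in range(N):
    S = a[t] * S + np.outer(K[t], V[t])
    Y_ref[t] = Q[t] @ S

# Chunkwise evaluation: sequential recurrence *between* chunks,
# dense matmuls *within* each chunk.
Y, S = np.zeros((N, d)), np.zeros((d, d))
for c0 in range(0, N, C):
    s = slice(c0, c0 + C)
    Qc, Kc, Vc = Q[s], K[s], V[s]
    b = np.cumprod(a[s])                           # cumulative decay within chunk
    D = np.tril(b[:, None] / b[None, :])           # decay from step j to step t
    Y[s] = b[:, None] * (Qc @ S) + (D * (Qc @ Kc.T)) @ Vc
    S = b[-1] * S + ((b[-1] / b)[:, None] * Kc).T @ Vc

print(np.allclose(Y_ref, Y))                       # True
```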
We extend DeltaNet by using products of Householder matrices as state-transition matrices, allowing us to trade off expressivity against computational complexity.
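A hedged numpy sketch of the idea (variable names and shapes are illustrative, not the paper's API): each token applies a product of n_h generalized Householder factors to the matrix state, each paired with its own rank-1 write; n_h = 1 recovers DeltaNet's update, while larger n_h yields a denser transition matrix at proportionally higher cost.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_h = 8, 3                                   # head dim, Householder factors per token

def delta_product_step(S, ks, vs, betas):
    """One token step composed of n_h Householder-style micro-steps."""
    for k, v, beta in zip(ks, vs, betas):
        S = S - beta * np.outer(k, k @ S)       # apply (I - beta k k^T) to the state
        S = S + beta * np.outer(k, v)           # DeltaNet-style rank-1 write
    return S

S = np.zeros((d, d))                            # matrix-valued recurrent state
ks = rng.standard_normal((n_h, d))
ks /= np.linalg.norm(ks, axis=-1, keepdims=True)   # unit-norm keys
vs = rng.standard_normal((n_h, d))
betas = rng.uniform(0.0, 1.0, size=n_h)         # with unit-norm k, beta in [0, 2] keeps
                                                # each factor's eigenvalues in [-1, 1]
S = delta_product_step(S, ks, vs, betas)

# Effective per-token state-transition matrix: the product of the n_h factors.
A_t = np.eye(d)
for k, beta in zip(ks, betas):
    A_t = (np.eye(d) - beta * np.outer(k, k)) @ A_t
print(np.abs(np.linalg.eigvals(A_t)).max() <= 1.0 + 1e-12)   # True
```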
We introduce the Fixed-Point RNN framework, which solves state-tracking tasks by parameterizing the state-transition matrix as implicitly dense.
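A hedged numpy sketch of the fixed-point idea under an assumed diagonal/off-diagonal splitting (not necessarily the paper's exact parameterization): a linear RNN with a dense state-transition matrix A is recovered as the fixed point of sweeps that each run only a cheap diagonal recurrence, with the off-diagonal part applied to the previous iterate.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 32, 6
A = 0.4 * rng.standard_normal((d, d)) / np.sqrt(d)   # dense, well-behaved transition
B = rng.standard_normal((T, d))                      # per-step inputs (B x_t)

# Reference: dense recurrence h_t = A h_{t-1} + b_t, run token by token.
H_ref, h = np.zeros((T, d)), np.zeros(d)
for t in range(T):
    h = A @ h + B[t]
    H_ref[t] = h

# Fixed-point iteration over the whole sequence: split A = diag + off-diag.
d_vec = np.diag(A)                                   # diagonal part of A
R = A - np.diag(d_vec)                               # off-diagonal residual
H = np.zeros((T, d))
for _ in range(T):   # exact after at most T sweeps; a contractive A needs far fewer
    H_new, h = np.zeros((T, d)), np.zeros(d)
    for t in range(T):
        prev = H[t - 1] if t > 0 else np.zeros(d)    # previous iterate
        h = d_vec * h + R @ prev + B[t]              # diagonal-only recurrence
        H_new[t] = h
    H = H_new

print(np.allclose(H, H_ref))                         # True
```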