2 papers across 2 sessions
DuSA, a novel dual-stage sparse attention mechanism, extends basic single-stage sparse attention to match or even outperform vanilla scaled dot-product attention across a range of tasks.
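The blurb does not spell out DuSA's two stages, so as a minimal sketch of the single-stage sparse attention baseline it builds on, here is a generic top-k variant in PyTorch: each query keeps only its top-k key scores and masks out the rest before the softmax. The function name `topk_sparse_attention` and the choice of top-k sparsity are illustrative assumptions, not DuSA's actual mechanism.

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, topk=16):
    """Generic top-k sparse attention (illustrative, not DuSA itself).
    Each query attends only to its topk highest-scoring keys; all other
    scores are masked to -inf. Shapes: (batch, heads, seq_len, head_dim)."""
    scale = q.shape[-1] ** -0.5
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale   # (B, H, L, L)
    # Per-query threshold: the smallest of each row's topk scores.
    kth = scores.topk(topk, dim=-1).values[..., -1:]
    scores = scores.masked_fill(scores < kth, float("-inf"))
    return torch.matmul(F.softmax(scores, dim=-1), v)
```

A dual-stage design would presumably refine this single selection step with a second pass, but the specifics are not given here.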
We propose MK-CAViT, a multi-kernel Vision Transformer with HGR-based correlation attention that achieves efficient multi-scale feature learning.
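MK-CAViT's HGR-based correlation attention is not detailed in this blurb; as a rough illustration of the multi-kernel idea alone, the hypothetical module below embeds image patches through parallel convolutions with different kernel sizes and concatenates the results, so each token carries features from multiple receptive-field scales. The class name, dimensions, and kernel sizes are assumptions for the sketch, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MultiKernelPatchEmbed(nn.Module):
    """Hypothetical multi-kernel patch embedding: parallel convolutions
    with different kernel sizes capture features at multiple scales;
    their outputs are concatenated into a single token embedding."""
    def __init__(self, in_ch=3, embed_dim=192, kernel_sizes=(4, 8, 16), stride=16):
        super().__init__()
        dim_each = embed_dim // len(kernel_sizes)
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, dim_each, kernel_size=k, stride=stride,
                      padding=(k - stride) // 2 if k > stride else 0)
            for k in kernel_sizes
        ])

    def forward(self, x):                        # x: (B, C, H, W)
        # Each branch yields the same token grid; flatten and concatenate.
        feats = [b(x).flatten(2).transpose(1, 2) for b in self.branches]
        return torch.cat(feats, dim=-1)          # (B, num_patches, embed_dim)
```

With a 224x224 input and stride 16, every branch produces a 14x14 token grid, so the per-branch features align token-for-token before concatenation.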