PhD student, City University of Hong Kong
1 paper at NeurIPS 2025
DuSA, a novel dual-stage sparse attention mechanism, extends basic single-stage sparse attention to match or even outperform vanilla scaled dot-product attention across different tasks.