Full Professor, City University of Hong Kong
2 papers at NeurIPS 2025
DuSA, a novel dual-stage sparse attention mechanism, extends basic single-stage sparse attention mechanisms to match or even outperform vanilla scaled dot-product attention across different tasks.