Researcher, Alibaba Group
2 papers at NeurIPS 2025
A sparse attention mechanism that balances efficiency, long-range random-access flexibility, and length-generalization ability
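To illustrate the general idea (this is a minimal sketch, not the paper's actual method, which is not described here), a toy block-sparse attention in NumPy: each query attends only to the top-k key blocks, ranked by similarity to cheap mean-pooled block summaries. This caps per-query compute while still allowing random access to distant blocks. All names and parameters below are illustrative assumptions.

```python
import numpy as np

def block_sparse_attention(q, k, v, block_size=4, top_k=2):
    """Toy block-sparse attention: each query attends to the top_k
    key blocks ranked by a mean-pooled block summary (illustrative)."""
    T, d = q.shape
    n_blocks = T // block_size
    # Mean-pool keys within each block to get cheap block summaries.
    k_blocks = k[: n_blocks * block_size].reshape(n_blocks, block_size, d)
    summaries = k_blocks.mean(axis=1)                    # (n_blocks, d)
    # Score each query against the summaries; keep the top_k blocks.
    block_scores = q @ summaries.T                       # (T, n_blocks)
    keep = np.argsort(-block_scores, axis=1)[:, :top_k]  # (T, top_k)
    out = np.zeros_like(q)
    for i in range(T):
        # Gather the selected blocks' token indices for this query.
        idx = np.concatenate(
            [np.arange(b * block_size, (b + 1) * block_size) for b in keep[i]]
        )
        # Dense softmax attention restricted to the selected tokens.
        scores = q[i] @ k[idx].T / np.sqrt(d)
        w = np.exp(scores - scores.max())
        w /= w.sum()
        out[i] = w @ v[idx]
    return out

rng = np.random.default_rng(0)
T, d = 16, 8
q, k, v = rng.normal(size=(3, T, d))
y = block_sparse_attention(q, k, v)
print(y.shape)  # (16, 8)
```

Each query's cost scales with `top_k * block_size` rather than the full sequence length, which is the efficiency/flexibility trade-off the tagline refers to.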