PhD student, The Hong Kong Polytechnic University
1 paper at NeurIPS 2025
A SoTA sequence parallelism method for linear attention, built on a brand-new collective communication operation.