Researcher, Alibaba Group
1 paper at NeurIPS 2025
FPSAttention is a training-aware co-design of FP8 quantization and sparsity for video diffusion models. It achieves up to 4.96× speedup without quality loss through aligned 3D tile granularity, denoising-step adaptation, and hardware-efficient kernels.