PhD student, Shanghai Jiaotong University
1 paper at NeurIPS 2025
In this paper, we propose a novel Cascade Adaptive Self-Speculative Decoding (CAS-Spec) algorithm which constructs speculative draft models by leveraging dynamically switchable inference acceleration (DSIA) strategies