PhD student, Fudan University
One paper accepted at NeurIPS 2025
We propose GRIFFIN to accelerate LLM inference by addressing the token misalignment issue in speculative decoding.