PhD student, National University of Singapore
1 paper at NeurIPS 2025
We propose GRIFFIN, which accelerates LLM inference by addressing the token misalignment issue in speculative decoding.