We propose GRIFFIN, a method that accelerates LLM inference by addressing the token-misalignment problem in speculative decoding.
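To make the misalignment issue concrete, here is a minimal sketch of greedy speculative decoding with toy stand-in models (`draft_model`, `target_model`, and the `k=4` draft length are illustrative assumptions, not GRIFFIN's actual models or settings): the draft model proposes several tokens cheaply, the target model verifies them, and at the first token where the two disagree the remainder of the draft is misaligned and must be discarded.

```python
# Toy greedy "models": each maps a context (tuple of ints) to one next token.
def draft_model(ctx):
    # Hypothetical cheap draft model: next token = last token + 1 (mod 10).
    return (ctx[-1] + 1) % 10

def target_model(ctx):
    # Hypothetical target model: agrees with the draft except after token 3,
    # where it prefers 7 -- this is where misalignment occurs.
    if ctx[-1] == 3:
        return 7
    return (ctx[-1] + 1) % 10

def speculative_step(ctx, k=4):
    """Draft k tokens, then verify them with the target model.

    Returns the accepted tokens: the prefix of the draft that the target
    agrees with, plus the target's own token at the first disagreement.
    """
    # Drafting phase: generate k tokens autoregressively with the cheap model.
    draft, c = [], list(ctx)
    for _ in range(k):
        t = draft_model(tuple(c))
        draft.append(t)
        c.append(t)

    # Verification phase: accept draft tokens while the target agrees.
    accepted, c = [], list(ctx)
    for t in draft:
        t_target = target_model(tuple(c))
        if t_target == t:
            accepted.append(t)
            c.append(t)
        else:
            # Misalignment: the target disagrees here, so every later draft
            # token was conditioned on a wrong prefix and is thrown away.
            accepted.append(t_target)
            break
    return accepted

print(speculative_step((0, 1, 2), k=4))  # → [3, 7]
```

With context `(0, 1, 2)` the draft proposes `[3, 4, 5, 6]`, but the target accepts only `3` before substituting `7`, so three of the four drafted tokens are wasted; this discard-after-divergence behavior is the misalignment cost that the paper targets.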