University of Chinese Academy of Sciences, Beijing

1 paper across 1 session

Poster Session 1

Wednesday, December 3, 2025 · 11:00 AM → 2:00 PM

EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization

#3418 · Yize Wu, KE GAO, Ling Li, Yanjun Wu

This paper presents a layer-parallel speculation strategy that improves the efficiency of multi-GPU utilization during the drafting stage of speculative decoding.