MS student, Korea Advanced Institute of Science & Technology
1 paper at NeurIPS 2025
Enhancing cost efficiency in LLM serving through an edge-assisted speculative decoding framework.