Full Professor, University of Science and Technology of China
1 paper at NeurIPS 2025
Optimizes KV cache eviction by adaptively allocating budgets across attention heads for efficient LLM inference
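To make the idea concrete, here is a minimal NumPy sketch of adaptive per-head budget allocation for KV cache eviction. The function names (`allocate_head_budgets`, `evict_kv`) and the entropy-based importance criterion are illustrative assumptions, not the paper's actual method; it only shows the general pattern of splitting one total budget unevenly across heads and then keeping the most-attended tokens per head.

```python
import numpy as np

def allocate_head_budgets(attn_scores, total_budget, min_budget=4):
    """Split a total KV-cache budget across heads in proportion to how
    dispersed each head's attention is (a hypothetical criterion).

    attn_scores: (num_heads, seq_len) attention weights each head assigns
                 to cached tokens (rows sum to 1).
    total_budget: total number of KV entries to keep across all heads.
    """
    num_heads, seq_len = attn_scores.shape
    # Importance proxy: heads with flat (high-entropy) attention arguably
    # need more cache entries than heads that focus on a few tokens.
    entropy = -np.sum(attn_scores * np.log(attn_scores + 1e-9), axis=1)
    weights = entropy / entropy.sum()
    budgets = np.maximum(min_budget,
                         np.floor(weights * total_budget).astype(int))
    return np.minimum(budgets, seq_len)

def evict_kv(keys, values, attn_scores, total_budget):
    """Keep only the top-scoring tokens per head, under per-head budgets.

    keys/values: (num_heads, seq_len, head_dim) cached tensors.
    Returns per-head lists, since heads keep different numbers of tokens.
    """
    budgets = allocate_head_budgets(attn_scores, total_budget)
    kept_keys, kept_vals = [], []
    for h, b in enumerate(budgets):
        keep = np.argsort(attn_scores[h])[-b:]  # most-attended token indices
        keep.sort()                             # preserve positional order
        kept_keys.append(keys[h, keep])
        kept_vals.append(values[h, keep])
    return kept_keys, kept_vals
```

The key design point this sketch captures is that eviction decisions are no longer uniform: a head whose attention is spread over many tokens receives a larger share of the cache than a head that attends to only a few, so the same total memory budget loses less information.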