PhD student, University of Science and Technology of China
1 paper at NeurIPS 2025
We propose a projection-based scoring function for KV cache eviction to accelerate LLM inference.