Researcher, Huawei Technologies Ltd.
1 paper at NeurIPS 2025
We propose a projection-based scoring function for KV cache eviction to accelerate LLM inference.