Researcher, Huawei Technologies Ltd.
2 papers at NeurIPS 2025
We propose a projection-based scoring function for KV cache eviction to accelerate LLM inference.