We propose a projection-based scoring function for KV cache eviction to accelerate LLM inference.