We propose an efficient, training-free prompt compression method that retains key information within long inputs using the evaluator heads we identified in transformer-based LLMs.
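To make the idea concrete, here is a minimal, hypothetical sketch of attention-head-based compression: tokens are ranked by the average attention they receive from a set of designated evaluator heads, and only the top-scoring fraction is kept in original order. The function name, input shapes, and scoring rule are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: rank tokens by attention from "evaluator heads"
# and keep the top fraction. All names and shapes are assumptions.

def compress_prompt(tokens, head_attentions, keep_ratio=0.5):
    """Keep the top `keep_ratio` fraction of tokens, ranked by the
    mean attention each token receives across evaluator heads.

    tokens: list[str] -- input tokens
    head_attentions: list[list[float]] -- one score list per evaluator
        head, each of length len(tokens)
    """
    n = len(tokens)
    # Average each token's score across the evaluator heads.
    scores = [sum(h[i] for h in head_attentions) / len(head_attentions)
              for i in range(n)]
    k = max(1, int(n * keep_ratio))
    # Indices of the k highest-scoring tokens, restored to input order
    # so the compressed prompt stays readable.
    top = sorted(sorted(range(n), key=lambda i: scores[i], reverse=True)[:k])
    return [tokens[i] for i in top]
```

In this sketch the compression is training-free because it only reads attention scores already produced by the frozen model; no parameters are updated.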