We propose an efficient, training-free prompt compression method that retains key information within long inputs using the evaluator heads we identified in transformer-based LLMs.
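To make the idea concrete, here is a minimal, hypothetical sketch of attention-head-based compression: tokens are ranked by the average attention they receive from a set of designated evaluator heads, and only the top-scoring fraction is kept in original order. The function name, input shapes, and scoring rule are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: rank tokens by attention from "evaluator heads"
# and keep the top fraction. All names and shapes are assumptions.

def compress_prompt(tokens, head_attentions, keep_ratio=0.5):
    """Keep the top `keep_ratio` fraction of tokens, ranked by the
    mean attention each token receives across evaluator heads.

    tokens: list[str] -- input tokens
    head_attentions: list[list[float]] -- one score list per evaluator
        head, each of length len(tokens)
    """
    n = len(tokens)
    # Average each token's score across the evaluator heads.
    scores = [sum(h[i] for h in head_attentions) / len(head_attentions)
              for i in range(n)]
    k = max(1, int(n * keep_ratio))
    # Indices of the k highest-scoring tokens, restored to input order
    # so the compressed prompt stays readable.
    top = sorted(sorted(range(n), key=lambda i: scores[i], reverse=True)[:k])
    return [tokens[i] for i in top]
```

In this sketch the compression is training-free because it only reads attention scores already produced by the frozen model; no parameters are updated.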