PhD student, Tsinghua University
1 paper at NeurIPS 2025
We propose an efficient, training-free prompt compression method that uses the evaluator heads we identified in transformer-based LLMs to retain key information within long inputs.