Associate Professor, Tsinghua University
5 papers at NeurIPS 2025
DIET makes LLMs more token-efficient by using problem difficulty to dynamically guide compression during reinforcement learning, boosting reasoning performance and enabling superior inference scaling under fixed budgets.
We propose Learning to Focus (LeaF), which identifies and masks confounding tokens via gradient-based comparisons, thereby improving long-context reasoning accuracy and interpretability in large language models.