PhD student, Renmin University of China
Two papers accepted at NeurIPS 2025
We find that excessively scaling Chain-of-Thought (CoT) length can impair a model's reasoning performance in certain domains, and we propose a Thinking-Optimal Scaling strategy for more effective and efficient test-time scaling.
We propose Learning to Focus (LeaF), which identifies and masks confounding tokens via gradient-based comparisons, thereby improving long-context reasoning accuracy and interpretability in large language models.
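The core idea of gradient-based token attribution can be illustrated with a toy sketch. This is a hypothetical, simplified illustration, not the LeaF algorithm itself: a linear stand-in model scores tokens by a gradient-times-input influence measure, and low-influence tokens are masked as candidate confounders. All function names and the masking heuristic here are assumptions for illustration.

```python
import numpy as np

def token_gradient_scores(E, w, y):
    """Toy gradient-times-input attribution for a linear stand-in model.

    E : (n_tokens, d) token embeddings; w : (d,) weights; y : scalar target.
    The 'model' sums per-token scores; loss = (sum - y)^2.
    """
    total = (E @ w).sum()
    # dL/dE_i = 2*(total - y) * w for every token i, so the per-token
    # influence is |E_i . grad| (gradient-times-input attribution).
    grad = 2.0 * (total - y) * w
    return np.abs(E @ grad)

def mask_confounders(tokens, scores, keep_ratio=0.75):
    """Keep the highest-influence tokens; mask the rest as confounders."""
    k = max(1, int(len(tokens) * keep_ratio))
    keep = set(np.argsort(scores)[-k:].tolist())
    return [t if i in keep else "[MASK]" for i, t in enumerate(tokens)]

# Usage: attribute and mask a short toy context.
rng = np.random.default_rng(0)
tokens = ["the", "answer", "is", "blue", "banana", "42"]
E = rng.normal(size=(6, 4))
w = rng.normal(size=4)
scores = token_gradient_scores(E, w, y=1.0)
masked = mask_confounders(tokens, scores, keep_ratio=0.5)
```

In the actual method, attribution would come from a full language model's gradients and a comparison between contexts rather than a single linear pass; the sketch only shows the score-then-mask pattern.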