PhD student, Shanghai Jiao Tong University
4 papers at NeurIPS 2025
We introduce Reinforcing Cognitive Experts (RICE), an inference-time steering method designed to improve reasoning depth and efficiency without additional training or complex heuristics.
Unsupervised Prefix Fine-Tuning Method for Reasoning Models
We introduce an RL framework that trains an LLM's reasoning and self-verification abilities simultaneously.