Full Professor, Tianjin University
2 papers at NeurIPS 2025
We make an early attempt at fully unsupervised LLM reasoning incentivization.
We propose a new Distributional Test-time Adaptation (DOTA) method, which continuously estimates the distribution of test samples.