Associate Professor, Renmin University of China
4 papers at NeurIPS 2025
FlexWorld generates flexible-view 3D scenes from single images using progressively expanding 3D Gaussian splatting and a fine-tuned video-to-video model, outperforming existing methods in quality and exploration flexibility.
We present a theoretical framework that interprets masked diffusion models (MDMs) as solutions to energy minimization problems, along with an efficient post-training schedule-tuning method that requires no model modification.
Scaling Diffusion Transformers up to 18B Efficiently via $\mu$P
We present LLaDA, a diffusion language model trained from scratch that is competitive with LLaMA 3 in performance.