Full Professor, Tsinghua University
5 papers at NeurIPS 2025
This paper presents a systematic pipeline for improving video generation with human feedback, including a large-scale preference dataset, a video reward model, and three alignment algorithms for flow matching models.
We propose a general reinforcement learning framework tailored for interleaved multimodal tasks by permutating image sequences to simulate varied positional relationships and explore more spatial and positional diversity