PhD student, The Hong Kong University of Science and Technology
1 paper at NeurIPS 2025
A novel alignment framework that integrates generative reward models with multi-modal RLHF.