PhD student, University of Science and Technology of China
1 paper at NeurIPS 2025
A benchmark that uses a comprehensive preference dataset to evaluate multimodal judges from multiple perspectives