PhD student, Huazhong University of Science and Technology
3 papers at NeurIPS 2025
We present ARGenSeg, a unified framework for multimodal understanding and pixel-level perception that achieves state-of-the-art performance in image segmentation.
This study introduces VADB, the largest video aesthetic database with 10,490 videos, and VADB-Net, a novel framework that outperforms existing models in video aesthetic assessment.