MS student, Shanghai Jiao Tong University
1 paper at NeurIPS 2025
We propose a universal video grounding model built on multimodal large language models (MLLMs) that achieves superior accuracy, generalizability, and robustness.