Researcher, ByteDance Inc.
1 paper at NeurIPS 2025
TempSamp-R1 is a reinforcement fine-tuning framework that combines off-policy supervision, soft advantage shaping, and hybrid Chain-of-Thought training to improve the temporal grounding capabilities of multimodal large language models (MLLMs).