Researcher, Google DeepMind
1 paper at NeurIPS 2025
We show that adapting vision foundation models using self-supervised fine-tuning with simple object-centric videos substantially improves representation quality without labels.