MS student, Brown University
2 papers at NeurIPS 2025
We show that self-supervised vision foundation models such as DINOv2 can develop strong noise robustness, without any explicit denoiser at downstream fine-tuning or inference, while also converging quickly, by combining a data curriculum with a denoised regularization loss and frequency-domain data augmentation during pretraining.
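One way frequency-domain augmentation can be realized is by jittering an image's Fourier amplitude spectrum while preserving its phase. The sketch below is a minimal illustration of that idea, not the method's actual implementation; the function name, the multiplicative-noise scheme, and the `sigma` parameter are all assumptions for exposition.

```python
import numpy as np

def freq_domain_augment(img, rng, sigma=0.1):
    """Perturb a 2D image's Fourier amplitude spectrum with
    multiplicative Gaussian noise, keeping the phase intact.
    (Illustrative scheme, not the paper's exact augmentation.)"""
    spectrum = np.fft.fft2(img)
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)
    # Jitter the amplitude spectrum; phase carries most structural
    # information, so perturbing amplitude alters texture/noise statistics.
    amplitude *= 1.0 + sigma * rng.standard_normal(amplitude.shape)
    augmented = amplitude * np.exp(1j * phase)
    # The input is real-valued, so we keep only the real part after ifft.
    return np.real(np.fft.ifft2(augmented))

rng = np.random.default_rng(0)
img = rng.random((32, 32))
aug = freq_domain_augment(img, rng)
```

Applying such a transform during pretraining exposes the encoder to varied spectral corruption, which is one plausible route to the noise robustness described above.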