Full Professor, Sorbonne Université
2 papers at NeurIPS 2025
We propose a novel architecture and training objective specifically designed to upsample features from foundation vision encoders at any resolution.
We perform input-dependent steering of MLLMs by learning to predict steering vectors from the input context.