Poster Session 1 · Wednesday, December 3, 2025 11:00 AM → 2:00 PM
#5405

JAFAR: Jack up Any Feature at Any Resolution

NeurIPS OpenReview

Abstract

Foundation Vision Encoders have become indispensable across a wide range of dense vision tasks. However, they output features at low spatial resolution, so those features must be upsampled before full-resolution processing is possible.
To address this limitation, we introduce JAFAR, a lightweight and flexible feature upsampler designed to enhance the spatial resolution of visual features from any Foundation Vision Encoder to any target resolution.
JAFAR features an attention-based upsampling module that aligns the spatial representations of high-resolution queries with semantically enriched low-resolution keys via Spatial Feature Transform modulation. Despite the absence of high-resolution feature ground truth, we find that learning at low upsampling ratios and resolutions generalizes surprisingly well to much higher scales.
Extensive experiments demonstrate that JAFAR recovers intricate pixel-level details and consistently outperforms existing feature upsampling techniques across a diverse set of dense downstream applications.
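
To make the attention-based upsampling described in the abstract more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the tensor shapes, the random stand-in inputs, and the helper names `sft_modulate` and `attention_upsample` are assumptions chosen for illustration. The sketch assumes that the low-resolution keys are scaled and shifted by parameters predicted from the encoder features (a Spatial Feature Transform), and that each high-resolution query then attends over all low-resolution positions to produce an upsampled feature.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sft_modulate(x, cond, w_gamma, w_beta):
    # Spatial Feature Transform (hypothetical helper): a per-location
    # scale (gamma) and shift (beta) of x, both predicted linearly from cond.
    gamma = cond @ w_gamma
    beta = cond @ w_beta
    return gamma * x + beta

def attention_upsample(q_hr, k_lr, v_lr):
    # Cross-attention (hypothetical helper): every high-resolution query
    # yields a convex combination of the low-resolution value vectors.
    d = q_hr.shape[-1]
    attn = softmax(q_hr @ k_lr.T / np.sqrt(d), axis=-1)
    return attn @ v_lr, attn

# Assumed toy sizes: a 4x4 feature map upsampled to 16x16.
rng = np.random.default_rng(0)
h_lr, w_lr, h_hr, w_hr = 4, 4, 16, 16
c_feat, d_attn = 32, 16

lr_feats = rng.normal(size=(h_lr * w_lr, c_feat))  # encoder features, used as values
k_base = rng.normal(size=(h_lr * w_lr, d_attn))    # stand-in for low-resolution keys
q_hr = rng.normal(size=(h_hr * w_hr, d_attn))      # stand-in for high-resolution queries
w_gamma = 0.1 * rng.normal(size=(c_feat, d_attn))  # assumed linear SFT projections
w_beta = 0.1 * rng.normal(size=(c_feat, d_attn))

k_lr = sft_modulate(k_base, lr_feats, w_gamma, w_beta)
hr_feats, attn = attention_upsample(q_hr, k_lr, lr_feats)
print(hr_feats.shape)  # (256, 32)
```

Because the output resolution is set only by the number of queries, the same weights can in principle be applied at any target resolution, which matches the "any resolution" claim in the abstract.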