We explore a training-free, confidence-aware fusion framework that adaptively combines features from Stable Diffusion 3 (SD3), Stable Diffusion (SD), and DINO for zero-shot semantic matching.
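
To make the idea of confidence-aware fusion concrete, here is a minimal sketch in Python with NumPy. It is not the paper's method: the summary above does not say how confidence is measured or how features are combined, so everything here is an assumption. In particular, the function name `fuse_and_match`, the `temperature` parameter, and the use of the peak softmax probability as a per-point confidence are hypothetical. The sketch takes pre-extracted point features from each backbone, computes one matching distribution per backbone, and weights each backbone per source point by how peaked its distribution is. No training is involved.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_and_match(src_feats, tgt_feats, temperature=0.05):
    """Hypothetical confidence-weighted fusion of per-backbone matches.

    src_feats, tgt_feats: dicts mapping a backbone name (e.g. "sd3", "sd",
    "dino") to arrays of shape (N_src, D) and (N_tgt, D). D may differ
    between backbones, because fusion happens on matching distributions
    rather than on raw features (an assumption of this sketch).
    Returns (matches, weights): the target index chosen for each source
    point, and the per-backbone fusion weights of shape (B, N_src).
    """
    probs, confs = [], []
    for name in src_feats:
        # Cosine similarity between every source and target point.
        sim = l2_normalize(src_feats[name]) @ l2_normalize(tgt_feats[name]).T
        p = softmax(sim / temperature, axis=1)  # (N_src, N_tgt)
        probs.append(p)
        # Assumed confidence measure: peak probability of the distribution.
        confs.append(p.max(axis=1))
    probs = np.stack(probs)  # (B, N_src, N_tgt)
    confs = np.stack(confs)  # (B, N_src)
    # Normalize confidences across backbones, separately per source point.
    weights = confs / confs.sum(axis=0, keepdims=True)
    fused = (weights[:, :, None] * probs).sum(axis=0)  # (N_src, N_tgt)
    return fused.argmax(axis=1), weights
```

Because the weights are computed per source point, a backbone that is ambiguous in one image region can be down-weighted there while still dominating elsewhere. Other confidence measures, such as the entropy of the matching distribution, would slot into the same structure.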