1 paper across 1 session
This paper proposes a Cross-Modal Schrödinger Bridge that aligns domain-specific images with domain-invariant text in order to improve generalization to unseen domains.