We propose a novel DP-driven multimodal learning framework that automatically optimizes the balance between prominent intra-modal representation learning and cross-modal alignment.