Associate Professor, KAIST
3 papers at NeurIPS 2025
We introduce MGAudio, a flow-based framework for video-to-audio generation that leverages model-guided dual-role alignment to achieve state-of-the-art performance.
We introduce Audio-Visual Contrastive Decoding (AVCD), a training-free framework that mitigates hallucinations in audio-visual large language models (AV-LLMs) by reformulating the existing contrastive decoding framework to support trimodal interactions.