5 papers across 3 sessions
We present a unified audio-visual framework for studying how humans and AI systems respond to modality conflicts and biases.