Vision-language

2 papers across 2 sessions

Poster Session 1

Wednesday, December 3, 2025 · 11:00 AM → 2:00 PM

What’s in Common? Multimodal Models Hallucinate When Reasoning Across Scenes

#4713 · Candace Ross, Florian Bordes, Adina Williams, Polina Kirichenko, Mark Ibrahim

We release a cognitively inspired benchmark for reasoning across scenes, which reveals that hallucination remains an open challenge for multimodal models.

Poster Session 3

1 paper

Thursday, December 4, 2025 · 11:00 AM → 2:00 PM

Exhibit Hall C,D,E

PARTONOMY: Large Multimodal Models with Part-Level Visual Understanding

#4909 Spotlight · Ansel Blume, Jeonghwan Kim, Hyeonjeong Ha, Elen Chatikyan, Xiaomeng Jin, Khanh Nguyen, Nanyun Peng, Kai-Wei Chang, Derek Hoiem, Heng Ji

Introducing PARTONOMY, a new benchmark, and PLUM, a segmenting LMM, which together enable fine-grained, part-level visual reasoning by addressing architectural flaws in existing LMMs and setting a new standard for grounded multimodal understanding.