1 paper across 1 session
Introducing PARTONOMY, a new benchmark, and PLUM, a segmenting LMM. Together they enable fine-grained, part-level visual reasoning by addressing architectural flaws in existing LMMs, and they set a new standard for grounded multimodal understanding.