CARIAD SE - NeurIPS 2025

🏛 CARIAD SE

1 paper across 1 session

Poster Session 2

Wednesday, December 3, 2025 · 4:30 PM → 7:30 PM

FOCUS: Internal MLLM Representations for Efficient Fine-Grained Visual Question Answering

#4617 · Liangyu Zhong, Fabio Philipp Rosenthal, Joachim Sicking, Fabian Hüger, Thorsten Bagdonat, Hanno Gottschalk, Leo Schwinn

We propose a training-free visual cropping method that uses MLLM-internal representations for VQA tasks that depend on small visual details. It achieves strong performance with significantly higher efficiency than prior methods.
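To make the idea of relevance-guided cropping concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a per-patch relevance grid has already been obtained from the model's internal representations (how that grid is computed is not described in this listing), and it uses a simple threshold heuristic chosen only for illustration. The function name `crop_box_from_relevance` and the parameter `keep_fraction` are hypothetical.

```python
from typing import List, Tuple


def crop_box_from_relevance(
    relevance: List[List[float]],
    image_size: Tuple[int, int],
    keep_fraction: float = 0.5,
) -> Tuple[int, int, int, int]:
    """Return a pixel box (left, top, right, bottom) around high-relevance patches.

    relevance: hypothetical rows x cols grid of non-negative per-patch scores.
    image_size: (width, height) of the image in pixels.
    keep_fraction: patches scoring at least keep_fraction * max are kept
        (an assumed heuristic, not taken from the paper).
    """
    rows, cols = len(relevance), len(relevance[0])
    width, height = image_size

    peak = max(max(row) for row in relevance)
    if peak <= 0:
        # No informative signal: fall back to the full image.
        return (0, 0, width, height)

    threshold = keep_fraction * peak
    hits = [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if relevance[r][c] >= threshold
    ]

    # Tight bounding box over the selected patches, in patch indices.
    r0 = min(r for r, _ in hits)
    r1 = max(r for r, _ in hits)
    c0 = min(c for _, c in hits)
    c1 = max(c for _, c in hits)

    # Convert patch indices to pixel coordinates (right/bottom are exclusive).
    left = c0 * width // cols
    top = r0 * height // rows
    right = (c1 + 1) * width // cols
    bottom = (r1 + 1) * height // rows
    return (left, top, right, bottom)


# Example: a 4 x 4 grid over a 400 x 200 image with two relevant patches.
grid = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.1],
    [0.1, 0.1, 0.6, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
box = crop_box_from_relevance(grid, (400, 200))
```

The resulting box could then be passed to an image library's crop routine and the cropped region given to the model alongside the question. Because the sketch only post-processes scores that are assumed to exist already, it involves no training step.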