Thorsten Bagdonat

Researcher, Group Innovation

1 paper at NeurIPS 2025

OpenReview · Semantic Scholar · Google Scholar

Poster Session 2

1 paper

Wednesday, December 3, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C, D, E

FOCUS: Internal MLLM Representations for Efficient Fine-Grained Visual Question Answering

#4617 · Liangyu Zhong, Fabio Rosenthal, Joachim Sicking, Fabian Hüger, Thorsten Bagdonat, Hanno Gottschalk, Leo Schwinn

We propose a training-free visual cropping method that leverages MLLM-internal representations for VQA tasks focusing on small details, achieving strong performance with significantly higher efficiency than prior methods.