1 paper across 1 session
Evaluating the spatial reasoning capabilities of large Vision-Language Models