1 paper across 1 session
Training- and GPU-free Spatial Prompting for Multimodal Large Language Models