We construct a reasoning-oriented geo-localization dataset from social media images and use it to fine-tune large vision-language models with reinforcement learning based on Group Relative Policy Optimization (GRPO), improving their location reasoning capabilities.
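
To make the group-relative idea behind GRPO concrete, here is a minimal sketch of how several sampled location guesses for one image could be scored and turned into advantages. It is an illustration under stated assumptions, not the training code behind this work: the distance-based reward, the `scale_km` parameter, the use of the population standard deviation, and all function names are hypothetical choices made for the example.

```python
import math
from statistics import mean, pstdev


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to guard against tiny floating-point overshoot above 1.0.
    return 2 * earth_radius_km * math.asin(math.sqrt(min(1.0, a)))


def distance_reward(pred, truth, scale_km=1000.0):
    """Hypothetical reward: 1.0 for an exact match, decaying with distance (assumption)."""
    return math.exp(-haversine_km(*pred, *truth) / scale_km)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std is an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# One image whose true location is Paris, with three sampled guesses.
truth = (48.8566, 2.3522)
guesses = [(48.86, 2.35), (51.5074, -0.1278), (40.7128, -74.0060)]
rewards = [distance_reward(g, truth) for g in guesses]
advantages = group_relative_advantages(rewards)
```

Because each advantage is measured relative to the other samples for the same image, a guess is reinforced only when it is better than its siblings, and no separate value network is needed to provide a baseline.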