1 paper across 1 session
Explores rule-based reinforcement learning (RL) for perception policy learning in the post-training of multimodal large language models (MLLMs).