2 papers across 2 sessions
We study the thinking process in visual reinforcement fine-tuning.
We propose manifold steering, which projects the overthinking steering direction onto the low-dimensional activation manifold, effectively reducing output tokens while maintaining accuracy.
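The projection step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the "manifold" is approximated by the top-k principal subspace of a sample of activations, and all names (`manifold_project`, `k`, the toy data) are hypothetical.

```python
import numpy as np

def manifold_project(steer_vec, activations, k=8):
    """Project a raw steering vector onto the top-k principal
    subspace (a low-dimensional proxy for the activation manifold)."""
    # Center the activations and take the top-k right singular
    # vectors as an orthonormal basis for the subspace.
    centered = activations - activations.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]                       # shape (k, d), orthonormal rows
    # Orthogonal projection of steer_vec onto span(basis).
    return basis.T @ (basis @ steer_vec)

rng = np.random.default_rng(0)
acts = rng.normal(size=(256, 64))        # toy activation sample (n, d)
v = rng.normal(size=64)                  # toy "overthinking" direction
v_proj = manifold_project(v, acts, k=8)

# At inference, subtract the projected direction from activations to
# steer away from overthinking; alpha would be a tunable strength.
alpha = 1.0
steered = acts - alpha * v_proj
```

Projecting before steering keeps the intervention inside the subspace the model's activations actually occupy, which is one plausible reason accuracy is preserved while output length drops.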