2 papers across 2 sessions
We pioneer training world models through reinforcement learning with verifiable rewards (RLVR), demonstrating substantial performance gains on both language- and video-based world models.