2 papers across 1 session
A unified and scalable RL framework applicable to online, offline, and offline-to-online settings.
We propose Q-chunking, a simple, effective offline-to-online RL method that uses action chunking to improve value propagation and exploration via temporally coherent actions.
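The summary above says only that action chunking improves value propagation. One plausible reading, sketched here as an assumption rather than the authors' implementation, is that a critic defined over a chunk of h consecutive actions can be trained with an h-step backup: the target sums the discounted rewards collected while the chunk executes and bootstraps from the critic's value at the state reached afterwards. The function name `chunked_td_target` and its parameters are hypothetical.

```python
def chunked_td_target(rewards, next_q, gamma, done=False):
    """Hypothetical h-step TD target for a critic over action chunks.

    rewards: the h rewards observed while one chunk of h actions executes.
    next_q:  critic estimate for the state reached after the chunk
             (evaluated on the next chunk of actions).
    gamma:   per-step discount factor.
    done:    True if the episode ended inside the chunk (no bootstrap).
    """
    target = 0.0
    # Discounted sum of the rewards collected inside the chunk.
    for i, r in enumerate(rewards):
        target += (gamma ** i) * r
    # Bootstrap h steps ahead, so value moves h steps per backup
    # instead of one step as in a standard 1-step TD target.
    if not done:
        target += (gamma ** len(rewards)) * next_q
    return target


# Usage: a chunk of 3 actions with reward 1 each, discount 0.5.
print(chunked_td_target([1.0, 1.0, 1.0], next_q=10.0, gamma=0.5))
```

With a chunk length of 1 this reduces to the usual one-step target; longer chunks let reward information travel further per update, which is the sense of "value propagation" assumed in this sketch.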