3 papers across 2 sessions
We propose an online reinforcement learning technique to fine-tune a family of flow matching policies for robot learning.
We propose the first framework for data attribution in online RL.