We present an online Self-Improvement procedure that requires no reward engineering. It enables robotic foundation models to improve their policies sample-efficiently and to autonomously practice and acquire skills that generalize far beyond their imitation data.