We propose a human-in-the-loop learning method that achieves faithful imitation via distribution alignment and adapts to evolving behavior using dynamic regret minimization.