PhD student, Delft University of Technology
1 paper at NeurIPS 2025
We prove an ensemble of policies distilled on a diverse dataset improves generalisation in reinforcement learning.