PhD student, University of Oxford
3 papers at NeurIPS 2025
We propose a principled taxonomy, evaluation procedure, and unified algorithm space for offline RL.
We apply Unsupervised Environment Design to create automatic curricula over world-model-generated environments, showing strong generalization to unseen procedural tasks while training exclusively within offline-learned world models.
A complete reimplementation of MiniGrid environments in JAX, unlocking 160,000x faster experimentation.