PhD student, University of Oxford
2 papers at NeurIPS 2025
We propose a way to create and access episodic memory when training transformer policies with RL on long-horizon tasks that require recalling information from the past.