logo
today local_bar
Poster Session 1 · Wednesday, December 3, 2025 11:00 AM → 2:00 PM
#2404

Quantifying Generalisation in Imitation Learning

NeurIPS Project Page Poster OpenReview

Abstract

Imitation learning benchmarks often lack sufficient variation between training and evaluation, limiting meaningful generalisation assessment. We introduce Labyrinth, a benchmarking environment designed to test generalisation with precise control over structure, start and goal positions, and task complexity.
It enables verifiably distinct training, evaluation, and test settings.
Labyrinth provides a discrete, fully observable state space and known optimal actions, supporting interpretability and fine-grained evaluation. Its flexible setup allows targeted testing of generalisation factors and includes variants like partial observability, key-and-door tasks, and ice-floor hazards.
By enabling controlled, reproducible experiments, Labyrinth advances the evaluation of generalisation in imitation learning and provides a valuable tool for developing more robust agents.
Poster