We introduce Impromptu VLA, an 80k-clip dataset of unstructured "corner case" driving scenarios with rich QA annotations; training on it significantly improves the safety and planning performance of Vision-Language-Action models.