Assistant Professor, National University of Singapore
2 papers at NeurIPS 2025
We introduce an embodied agent for open-world mobile manipulation, built on a VLM fine-tuned with agentically synthesized data, that unifies scene understanding, state tracking, and action generation and achieves state-of-the-art results.
A Planning Representation and Paradigm Investigation of Vision-Language-Action Models