Assistant Professor, Hong Kong University of Science and Technology
2 papers at NeurIPS 2025
We present 3EED, the first large-scale benchmark for 3D visual grounding across vehicles, drones, and quadrupeds, with over 134K 3D objects and 25K human-verified expressions in diverse outdoor scenes.
We propose a benchmark for zero-shot cross-task generalization in manipulation, together with a novel generalizable VLA method.