Researcher, Huawei Technologies Ltd.
3 papers at NeurIPS 2025
This paper introduces 4D-VLA, a pretraining framework that improves spatial and temporal reasoning in robotics by using RGB-D sequences to align coordinate systems and by sampling memory efficiently.
We introduce SPAR-7M and SPAR-Bench, respectively a large-scale 2D-supervised dataset and a benchmark for spatial reasoning, both built from 3D ground truth.