Research Scientist, ByteDance Inc.
3 papers at NeurIPS 2025
We propose a 3D VLA model that aligns its input and output within a shared 2D space during both pre-training and fine-tuning, which enables high data efficiency and achieves impressive performance in both basic and generalization settings.