We introduce PhyBlock, a progressive benchmark evaluating large vision-language models on physical understanding and spatial planning via robotic 3D block assembly tasks.