1 paper across 1 session
Introduce the first Open-World Hand-Object Interaction (HOI) Synthesis framework that can generate Long-horizon HOI sequences of Unseen Objects from Open-vocabulary instruction with 3D Multimodal Large Language Model.