1 paper across 1 session
We propose a dataset of 3D full-body poses and cooking videos, along with benchmarks for multimodal behavior understanding.