2 papers across 2 sessions
Estimate 3D human poses from multi-view radar data, supervised with 2D image-plane keypoint labels and 3D bounding-box labels rather than more expensive 3D keypoint labels.
We propose a dataset of 3D full-body poses and cooking videos, along with benchmarks for multimodal behavior understanding.