5 papers across 3 sessions
We improve 3D scene graph prediction by strengthening object representations and relationship modeling, showing that accurate object features are crucial for capturing how objects relate to each other in 3D environments.
We propose DANCE, a novel explainable video action recognition framework that provides clear and structured explanations by disentangling motion dynamics and spatial context concepts.
We prevent unauthorized personalization of diffusion models at the model level.
We reveal that shortcut features correspond to the top eigenfunctions of the neural tangent kernel (NTK) and dominate the neural network output after convergence.
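The claim above rests on a standard property of linearized (NTK-regime) training dynamics: under gradient descent on squared loss, the component of the target along each kernel eigendirection is fitted at a rate set by its eigenvalue, so top-eigenvalue directions are learned first and dominate the output. A minimal sketch of that dynamic, using a random-feature kernel as a stand-in for the NTK (all settings here are illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 200                      # samples, random features
Phi = rng.normal(size=(n, d)) / np.sqrt(d)
y = rng.normal(size=n)

K = Phi @ Phi.T                     # empirical kernel (NTK stand-in)
lam, V = np.linalg.eigh(K)          # eigenvalues in ascending order

# Gradient descent on w for the linear model f = Phi @ w, loss 0.5*||f - y||^2.
w = np.zeros(d)
eta = 0.5 / lam.max()               # stable step size
for _ in range(5):                  # stop early, before full convergence
    w -= eta * Phi.T @ (Phi @ w - y)

f = Phi @ w
# Fraction of each eigendirection's target component learned so far;
# analytically this equals 1 - (1 - eta*lam_i)^t, independent of y.
learned = (V.T @ f) / (V.T @ y)
# learned[-1] (top eigenvalue) is near 1; learned[0] (bottom) lags behind.
```

If a "shortcut" feature aligns with the top eigendirections, this ordering means the network output is dominated by it early in training and remains so at convergence.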