We propose the first feed-forward framework that computes a 4D spatio-temporal grid of video frames together with a 3D Gaussian point cloud for each time step.