2 papers across 2 sessions
We propose a pipeline for detecting visual inconsistencies using visual correspondences based on disentangled diffusion model features.
We show that supervised semantic correspondence methods fail to generalize well to unseen keypoints, and we introduce geometric constraints during training to address this.