2 papers across 2 sessions
We uncover the emergent open-vocabulary semantic segmentation capability of diffusion transformers and show that amplifying this property enhances both segmentation and image generation.
A lightweight, plug-and-play mapper that boosts open-vocabulary semantic segmentation (OVSS) performance with minimal computational overhead.