Poster Session 3 · Thursday, December 4, 2025 11:00 AM → 2:00 PM
#4410 Spotlight
RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation
Abstract
While latent diffusion models (LDMs), such as Stable Diffusion, are designed for high-resolution image generation, they often suffer significant structural distortions when generating images at resolutions higher than their training resolution. Instead of relying on extensive retraining, a more resource-efficient approach is to reprogram the pretrained model for high-resolution (HR) image generation; however, existing methods often yield poor image quality and long inference times.
We introduce RepLDM, a novel reprogramming framework for pretrained LDMs that enables high-quality, high-efficiency, high-resolution image generation; see Fig. 1. RepLDM consists of two stages:
- an attention guidance stage, which generates a latent representation of a higher-quality training-resolution image using a novel parameter-free self-attention mechanism to enhance the structural consistency;
- a progressive upsampling stage, which progressively performs upsampling in pixel space to mitigate the severe artifacts caused by latent space upsampling.
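The progressive upsampling stage implies a schedule of intermediate pixel resolutions between the training resolution and the target. A minimal sketch of one plausible schedule is below; the function name, the growth factor, and the multiple-of-8 rounding (for VAE/latent compatibility) are illustrative assumptions, not the paper's implementation:

```python
def progressive_resolutions(base: int, target: int, factor: float = 1.5) -> list[int]:
    """Return a monotone schedule of resolutions from `base` up to `target`,
    growing by roughly `factor` per step, rounded to a multiple of 8.

    Hypothetical sketch for illustration; RepLDM's actual schedule may differ.
    """
    assert target >= base > 0
    sizes = [base]
    cur = base
    while cur < target:
        # Grow by `factor`, snap to a multiple of 8, cap at the target.
        cur = min(target, int(round(cur * factor / 8)) * 8)
        if cur <= sizes[-1]:  # guard against stalls from rounding at small sizes
            cur = min(target, sizes[-1] + 8)
        sizes.append(cur)
    return sizes

# e.g. 512 -> 2048 in a few pixel-space steps rather than one large jump:
print(progressive_resolutions(512, 2048))
```

Each intermediate resolution would then be upsampled and refined in pixel space before the next step, avoiding a single large latent-space upsampling jump.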
Extensive experimental results demonstrate that RepLDM significantly outperforms state-of-the-art methods in both quality and efficiency for HR image generation, underscoring its advantages for real-world applications. Code: https://github.com/kmittle/RepLDM.