text-to-vision

1 paper across 1 session

Poster Session 4

Thursday, December 4, 2025 · 4:30 PM → 7:30 PM

Rare Text Semantics Were Always There in Your Diffusion Transformer

#4810 · Seil Kang, Woojung Han, Dayun Ju, Seong Jae Hwang

Scaling up the variance of text-token embeddings in MM-DiTs before joint attention lets the semantics of rare prompts emerge without retraining, extra data, or denoising tweaks, improving text-to-image generation, video generation, and editing.
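
As a rough illustration of what a variance scale-up of token embeddings can look like, the sketch below widens the spread of a small embedding matrix around its per-feature mean. The listing does not give the paper's formulation, so everything here is an assumption: the function name `scale_text_token_variance`, the `factor` parameter, and the choice to measure variance per feature across tokens are all hypothetical, and the snippet uses plain Python lists rather than a real MM-DiT.

```python
import math
import random
from statistics import fmean

def scale_text_token_variance(text_emb, factor):
    """Multiply the per-feature variance across tokens by `factor`.

    text_emb: list of token embeddings, each a list of floats (tokens x dim).
    Assumption: variance is measured per feature across tokens; the paper
    may define the statistic differently.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    dim = len(text_emb[0])
    # Per-feature mean across all text tokens.
    means = [fmean(row[j] for row in text_emb) for j in range(dim)]
    # Scaling deviations by sqrt(factor) scales the variance by factor.
    gain = math.sqrt(factor)
    return [
        [means[j] + gain * (row[j] - means[j]) for j in range(dim)]
        for row in text_emb
    ]

# Toy stand-in for text-token embeddings: 6 tokens, 4 features.
random.seed(0)
text_emb = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(6)]

# In a pipeline this step would sit just before the joint attention blocks.
scaled = scale_text_token_variance(text_emb, factor=4.0)
```

Because only the deviations from the mean are rescaled, the per-feature mean is left unchanged while the variance grows by exactly `factor`; no weights are modified, which matches the summary's claim that no retraining is involved.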