logo
today local_bar
Poster Session 5 · Friday, December 5, 2025 11:00 AM → 2:00 PM
#3203

Breaking AR’s Sampling Bottleneck: Provable Acceleration via Diffusion Language Models

NeurIPS OpenReview

Abstract

Diffusion models have emerged as a powerful paradigm for modern generative modeling, demonstrating strong potential for large language models (LLMs). Unlike conventional autoregressive (AR) models that generate tokens sequentially, diffusion models allow for parallel sampling, offering a promising path to accelerate generation and eliminate the left-to-right generation constraints. Despite their empirical success, theoretical understandings of diffusion language models remain underdeveloped.
In this work, we develop convergence guarantees for diffusion language models from an information-theoretic perspective. Our analysis demonstrates that the sampling error, measured by the Kullback-Leibler (KL) divergence, decays inversely with the number of iterations and scales linearly with the mutual information between tokens in the target text sequence. Crucially, our theory covers the regime