2 papers across 2 sessions
Masked Diffusion Models can generate low-perplexity text efficiently, but can not handle tasks requiring high accuracy efficiently.