1 paper across 1 session
Masked Diffusion Models can generate low-perplexity text efficiently, but can not handle tasks requiring high accuracy efficiently.