2 papers across 1 session
We demonstrate that training diffusion bridges with a reverse Kullback-Leibler loss and the log-derivative trick outperforms the commonly used log-variance loss, yielding both better results and more stable training.
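As a rough, generic illustration of the log-derivative (score-function) trick for a reverse-KL objective, the PyTorch sketch below builds a surrogate loss whose gradient matches that estimator. The function name, the batch-mean baseline, and the assumption that `log_q` and `log_p` are per-sample log-probabilities of samples drawn from the trainable model are all illustrative choices, not the paper's implementation.

```python
import torch

def reverse_kl_surrogate(log_q: torch.Tensor, log_p: torch.Tensor) -> torch.Tensor:
    """Surrogate loss whose gradient is the log-derivative (score-function)
    estimator of the reverse KL  E_{x ~ q_theta}[log q_theta(x) - log p(x)].

    log_q: log q_theta(x) for samples x ~ q_theta, differentiable w.r.t. theta
    log_p: log p(x) for the same samples, treated as a fixed target
    """
    # Per-sample reverse-KL integrand, detached so it acts as a scalar weight.
    weight = (log_q - log_p).detach()
    # Batch-mean baseline: reduces estimator variance without changing its mean.
    weight = weight - weight.mean()
    # Differentiating (weight * log_q) gives weight * d(log q)/dtheta, i.e. the
    # score-function gradient; the leftover E[d(log q)/dtheta] term has zero
    # expectation and is dropped in this sketch.
    return (weight * log_q).mean()
```

In practice one would compute `log_q` over sampled bridge trajectories and `log_p` under the target process, then backpropagate through this surrogate with a standard optimizer.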
Absorbing-state discrete diffusion models that enable remasking benefit from inference-time scaling, improving sample quality, controllable generation, and performance on downstream tasks.
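To give a rough sense of how remasking lets extra inference steps revise earlier decisions, here is a minimal sampler sketch. The `denoiser` (assumed to return per-position logits), `mask_id`, the linear unmasking schedule, and the fixed remasking probability are all hypothetical placeholders, not the paper's algorithm.

```python
import torch

@torch.no_grad()
def remasking_sampler(denoiser, length, vocab_size, mask_id,
                      num_steps=128, remask_prob=0.05, device="cpu"):
    """Illustrative absorbing-state discrete diffusion sampler with remasking."""
    # Start from the fully masked (absorbing) state.
    x = torch.full((1, length), mask_id, dtype=torch.long, device=device)
    for step in range(num_steps):
        logits = denoiser(x)                                  # (1, length, vocab_size)
        probs = logits.softmax(dim=-1)
        sampled = torch.multinomial(probs.view(-1, vocab_size), 1).view(1, length)
        # Unmask a growing fraction of masked positions as steps progress.
        frac_unmasked = (step + 1) / num_steps
        unmask = torch.rand(1, length, device=device) < frac_unmasked
        x = torch.where((x == mask_id) & unmask, sampled, x)
        # Remasking: return a small random subset of committed tokens to the
        # absorbing state so later steps can revise them; running more steps
        # therefore spends additional compute on refinement.
        remask = (x != mask_id) & (torch.rand(1, length, device=device) < remask_prob)
        x[remask] = mask_id
    # Fill any positions still masked after the final step.
    return torch.where(x == mask_id, sampled, x)
```

The key design point this sketch tries to convey is that without remasking the number of useful steps is capped by the unmasking schedule, whereas remasking keeps every position revisable, so sample quality can keep improving as `num_steps` grows.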