PhD student, Ecole Nationale de la Statistique et de l'Administration Economique
1 paper at NeurIPS 2025
We propose a policy gradient algorithm for fine-tuning discrete diffusion models with non-differentiable rewards.