MS student, ShanghaiTech University
1 paper at NeurIPS 2025
We propose GenPO, which integrates an invertible diffusion model into on-policy RL and addresses the challenge of computing log-likelihoods for diffusion policies.