On-Policy Learning

1 paper across 1 session

Poster Session 6

Friday, December 5, 2025 · 4:30 PM → 7:30 PM

GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning

#314 · Shutong Ding, Ke Hu, Shan Zhong, Haoyang Luo, Weinan Zhang, Jingya Wang, Jun Wang, Ye Shi

We propose GenPO, which incorporates an invertible diffusion model into on-policy RL and addresses the challenge of computing log-likelihoods in diffusion policies.
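To illustrate why invertibility helps with log-likelihood computation, here is a minimal sketch of the standard change-of-variables formula on a toy 2-D invertible map. It is not GenPO's architecture: the affine coupling map, the function names, and the `scale_w` and `shift_w` parameters are all hypothetical, chosen only to show that when noise-to-action generation is invertible, the exact log-probability of an action (the quantity on-policy methods need for likelihood ratios) can be recovered by inverting the map and subtracting the log-determinant of its Jacobian.

```python
import math


def standard_normal_logpdf(z):
    """Log density of an isotropic standard normal evaluated at vector z."""
    return sum(-0.5 * (zi * zi + math.log(2.0 * math.pi)) for zi in z)


def coupling_forward(z, scale_w=0.5, shift_w=1.0):
    """Hypothetical affine coupling map from noise z to action a (2-D).

    The first coordinate passes through unchanged; the second is scaled
    and shifted by functions of the first. The Jacobian is triangular,
    so log|det J| is just the log-scale. Returns (a, log_det).
    """
    z1, z2 = z
    log_s = math.tanh(scale_w * z1)
    t = shift_w * z1
    return (z1, z2 * math.exp(log_s) + t), log_s


def coupling_inverse(a, scale_w=0.5, shift_w=1.0):
    """Exact inverse of coupling_forward: recover noise z from action a."""
    a1, a2 = a
    log_s = math.tanh(scale_w * a1)
    t = shift_w * a1
    return (a1, (a2 - t) * math.exp(-log_s))


def action_log_prob(a, scale_w=0.5, shift_w=1.0):
    """Exact log p(a) via change of variables: log p(z) - log|det da/dz|."""
    z = coupling_inverse(a, scale_w, shift_w)
    _, log_det = coupling_forward(z, scale_w, shift_w)
    return standard_normal_logpdf(z) - log_det


if __name__ == "__main__":
    noise = (0.3, -1.2)
    action, _ = coupling_forward(noise)
    print("action:", action)
    print("log-prob:", action_log_prob(action))
```

A non-invertible sampler gives no such shortcut, since the density of the final action would require marginalizing over all intermediate noise.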