2 papers across 2 sessions
We extend a recently proposed RL-based, gradient-informed finetuning method to the task of reward finetuning/alignment for 3D-native diffusion models.
We propose a value gradient matching formulation for reward finetuning/alignment of flow matching models, grounded in optimal control theory, and empirically validate our method on the popular text-to-image flow matching model Stable Diffusion 3.