2 papers across 2 sessions
We propose a post-training technique that uses a hypernetwork to efficiently steer the initial noise of a diffusion model towards distributions favored by reward models, enhancing generation quality at low computational cost.