2 papers across 1 session
We propose a novel inference-time personalized alignment method that elicits the user's preferences with a few preference queries.
We build a Monte Carlo Tree over the diffusion denoising process that can be used for scalable, compute-efficient, inference-time alignment of pretrained diffusion models to new reward functions.