Intern, Microsoft Research Asia
1 paper at NeurIPS 2025
GradSPO reinterprets Stepwise Preference Optimization (SPO) as a form of gradient guidance. This view yields a simplified objective with integrated noise reduction, improving human preference alignment in text-to-image models.