Full Professor, University of Southern California
1 paper at NeurIPS 2025
We address distributional shift among diverse preferences with robust DPO: Wasserstein DPO (WDPO) and Kullback–Leibler DPO (KLDPO). We provide finite-sample guarantees, develop tractable gradient-based algorithms for the otherwise hard distributionally robust optimization (DRO) objectives, and demonstrate strong empirical robustness.