PhD student, Pennsylvania State University
1 paper at NeurIPS 2025
An importance-sampling-based method to mitigate over-optimization in Direct Alignment Algorithms for language model alignment