Di Wu

Undergraduate student, University of Science and Technology of China

1 paper at NeurIPS 2025

OpenReview · Semantic Scholar · Google Scholar

Poster Session 2

1 paper

Wednesday, December 3, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C, D, E

Greedy Sampling Is Provably Efficient For RLHF

#3313 · Di Wu, Chengshuai Shi, Jing Yang, Cong Shen

This work shows that greedy sampling based on empirical estimates is provably efficient for RLHF, under both the general preference model and the Bradley-Terry model.
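
For readers unfamiliar with the terms in the summary above, the following is a minimal illustrative sketch, not the paper's algorithm. It shows the standard Bradley-Terry preference probability (the probability that one response is preferred over another is the logistic function of their reward difference) and a plain "greedy" selection that picks the response with the highest empirical reward estimate, with no exploration bonus. The function names and the toy reward values are hypothetical and chosen only for illustration.

```python
import math


def bt_preference_prob(reward_a: float, reward_b: float) -> float:
    """Bradley-Terry model: P(a preferred over b) = sigmoid(r_a - r_b)."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


def greedy_choice(estimated_rewards: dict[str, float]) -> str:
    """Return the response with the highest empirical reward estimate.

    'Greedy' here means acting on the current estimates directly,
    without adding any optimism or pessimism term (an assumption of
    this sketch, not a description of the paper's exact procedure).
    """
    return max(estimated_rewards, key=estimated_rewards.get)


if __name__ == "__main__":
    # Hypothetical empirical reward estimates for three candidate responses.
    estimates = {"response_a": 0.2, "response_b": 0.9, "response_c": -0.4}
    best = greedy_choice(estimates)
    print("greedy pick:", best)
    print("P(response_b preferred over response_a):",
          bt_preference_prob(estimates["response_b"], estimates["response_a"]))
```

Equal rewards give a preference probability of exactly 0.5, and the two directions of any comparison sum to 1, which is what makes the model a valid pairwise preference distribution.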