PhD student, Massachusetts Institute of Technology
1 paper at NeurIPS 2025
We show that off-the-shelf process reward models (PRMs) are often poorly calibrated. To address this, we introduce a quantile-regression calibration method that aligns their outputs with success probabilities. We further show that calibration unlocks instance-adaptive inference-time scaling.
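
As a rough illustration of the two ideas named above, and not the paper's actual method, the sketch below shows the standard pinball (quantile) loss used in quantile regression, and one simple way a calibrated success probability could set a per-instance sample budget. The function names, the `target` and `max_samples` parameters, and the assumption that samples succeed independently are all hypothetical choices made for this example.

```python
import math


def pinball_loss(y_true: float, y_pred: float, tau: float) -> float:
    """Standard quantile (pinball) loss at quantile level tau in (0, 1).

    Under-prediction is weighted by tau, over-prediction by (1 - tau).
    """
    diff = y_true - y_pred
    return max(tau * diff, (tau - 1.0) * diff)


def samples_needed(p_success: float, target: float = 0.95,
                   max_samples: int = 64) -> int:
    """Smallest N with 1 - (1 - p)^N >= target, capped at max_samples.

    Illustrative assumption: each sample succeeds independently with
    probability p_success (e.g. a calibrated PRM estimate).
    """
    if p_success <= 0.0:
        return max_samples  # no evidence of success: spend the full budget
    if p_success >= 1.0:
        return 1            # certain success: one sample is enough
    n = math.ceil(math.log(1.0 - target) / math.log(1.0 - p_success))
    return max(1, min(n, max_samples))


if __name__ == "__main__":
    # Easier instances (higher estimated success) get smaller budgets.
    for p in (0.9, 0.5, 0.1):
        print(p, samples_needed(p))
```

Under these assumptions, an instance with a high estimated success probability receives only a few samples, while a hard instance receives many more, which is the sense in which the budget adapts per instance.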