Research Scientist, MIT-IBM Watson AI Lab, IBM Research
2 papers at NeurIPS 2025
We show that off-the-shelf process reward models (PRMs) are often poorly calibrated. To address this, we introduce a quantile-regression calibration method that aligns their outputs with success probabilities. We further show that calibration enables instance-adaptive inference-time scaling.
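As a rough illustration of the two ideas named above (not the paper's actual implementation), the sketch below shows the standard pinball loss that underlies quantile regression, and one simple way a calibrated success probability could set a per-instance sampling budget. The function names `pinball_loss` and `samples_needed`, the `target` and `max_samples` parameters, and the assumption that samples succeed independently are all hypothetical choices made for this example.

```python
import math


def pinball_loss(y_true: float, y_pred: float, tau: float) -> float:
    """Standard quantile (pinball) loss for a quantile level tau in (0, 1).

    Under-prediction is weighted by tau and over-prediction by (1 - tau),
    so minimizing it pushes y_pred toward the tau-quantile of y_true.
    """
    diff = y_true - y_pred
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def samples_needed(p_success: float, target: float = 0.95,
                   max_samples: int = 64) -> int:
    """Smallest N such that 1 - (1 - p)^N >= target, capped at max_samples.

    Assumption for illustration only: each sample succeeds independently
    with the calibrated probability p_success.
    """
    if p_success <= 0.0:
        return max_samples  # no estimated chance of success: spend the cap
    if p_success >= 1.0:
        return 1  # certain success: one sample suffices
    n = math.ceil(math.log(1.0 - target) / math.log(1.0 - p_success))
    return max(1, min(n, max_samples))


if __name__ == "__main__":
    # Easy instances get a small budget, hard ones a larger (capped) budget.
    for p in (0.9, 0.5, 0.01):
        print(p, samples_needed(p))
```

Under these assumptions, a question with a high calibrated success probability receives only a couple of samples, while a hard one is allotted the full cap, which is the sense in which the budget adapts per instance.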