Assistant Professor, Massachusetts Institute of Technology
2 papers at NeurIPS 2025
AIM introduces a new scheme for improving model-merging performance in LLMs by combining continual learning principles with activation-space-based model compression.
We find that off-the-shelf PRMs are often poorly calibrated. To address this, we introduce a quantile-regression calibration method that aligns their outputs with true success probabilities. We show that calibration unlocks instance-adaptive inference-time scaling.