Full Professor, University of Michigan - Ann Arbor
3 papers at NeurIPS 2025
Instance-level adaptive KL penalty control method for Direct Preference Optimization
We present MLRC-Bench, a dynamic benchmark that uses objective, performance-based evaluations to rigorously assess how well language agents address ML research challenges.