MS student, University of Michigan - Ann Arbor
1 paper at NeurIPS 2025
We present MLRC-Bench, a dynamic benchmark that uses objective, performance-based evaluations to rigorously assess how well language agents address ML research challenges.