Researcher, Allen Institute for Artificial Intelligence
2 papers at NeurIPS 2025
Measuring and improving the signal-to-noise ratio in language model benchmarks.