PhD student, Department of Computer Science, ETHZ - ETH Zurich
1 paper at NeurIPS 2025
We introduce MathArena, a new benchmark for evaluating LLMs on recurring math competitions, which provide a stream of high-quality, uncontaminated problems.