PhD student, The University of Tokyo
1 paper at NeurIPS 2025
We introduce ALE-Bench, a new benchmark for evaluating AI systems on score-based algorithmic programming contests.