Researcher, Google
2 papers at NeurIPS 2025
We provide a robust method for directly optimizing pass@k with reinforcement learning, supported by theory and real-world experiments.
Learning proof system dynamics, pruning proof search based on diversity and expected outcome.