Researcher, Google Deepmind
2 papers at NeurIPS 2025
We design an RL environment for automated mathematical theory formation and present an evolutionary synthesis methodology for interestingness measures to guide exploration in it
We introduce CLEVER, a hand-curated benchmark for verified code generation in Lean. It requires full formal specs and proofs. No few-shot method solves all stages, making it a strong testbed for synthesis and formal reasoning.