5 papers across 3 sessions
By carefully coordinating off-the-shelf models at inference time only, we show that the DSP framework achieves surprisingly strong theorem-proving results, comparable to frontier models trained with large-scale RL.
We introduce IneqMath, an informal inequality-proving benchmark with an LLM-as-judge evaluation suite, revealing that top LLMs achieve under 10% overall accuracy, largely due to flawed step-wise reasoning.
Learning proof-system dynamics, and pruning proof search based on candidate diversity and expected outcome.
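The pruning idea above can be sketched as a greedy selection over candidate tactics, keeping those that maximize a model-estimated success probability plus a diversity bonus. This is a minimal illustrative sketch, not the paper's method: the `Candidate` class, the lexical diversity measure, and the scoring weights are all hypothetical.

```python
# Hypothetical sketch: prune proof-search candidates by expected outcome
# plus a diversity bonus. All names and the scoring rule are illustrative.
from dataclasses import dataclass

@dataclass
class Candidate:
    tactic: str
    expected_success: float  # model-estimated probability this branch closes the goal

def diversity(tactic: str, kept: list[Candidate]) -> float:
    """Crude lexical diversity: fraction of tokens unseen among already-kept tactics."""
    tokens = set(tactic.split())
    seen = {t for c in kept for t in c.tactic.split()}
    return len(tokens - seen) / max(len(tokens), 1)

def prune(candidates: list[Candidate], beam: int = 3, alpha: float = 0.5) -> list[Candidate]:
    """Greedily keep `beam` candidates maximizing expected_success + alpha * diversity."""
    kept: list[Candidate] = []
    pool = list(candidates)
    while pool and len(kept) < beam:
        best = max(pool, key=lambda c: c.expected_success + alpha * diversity(c.tactic, kept))
        kept.append(best)
        pool.remove(best)
    return kept

cands = [
    Candidate("ring", 0.9),
    Candidate("ring_nf", 0.85),
    Candidate("nlinarith [sq_nonneg x]", 0.6),
    Candidate("simp", 0.5),
]
kept = prune(cands, beam=2)
```

The diversity term discourages the beam from filling up with near-duplicate tactics, so lower-probability but structurally different candidates can survive pruning.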
We introduce Ineq-Comp, a benchmark for testing compositional reasoning in formal inequality proving. Simple human-intuitive transformations cause major accuracy drops, showing that current LLM provers lack robust compositional generalization.