Tianyi Qiu

AI Safety Fellow, Anthropic

1 paper at NeurIPS 2025

Homepage · OpenReview · Semantic Scholar · Google Scholar

Poster Session 3

1 paper

Thursday, December 4, 2025 · 11:00 AM → 2:00 PM

Exhibit Hall C,D,E

Martingale Score: An Unsupervised Metric for Bayesian Rationality in LLM Reasoning

#2810 · Zhonghao He, Tianyi Qiu, Hirokazu Shirado, Maarten Sap

We introduce the Martingale Score, an unsupervised metric from Bayesian statistics, to show that reasoning in LLMs often leads to belief entrenchment rather than truth-seeking, and we show that this score predicts ground-truth accuracy.