Researcher, Facebook
1 paper at NeurIPS 2025
Our new benchmark, AbstentionBench, reveals that reasoning models struggle to determine when not to answer.