4 papers across 3 sessions
We propose SimpleStrat for diversifying LLM generations and introduce CoverageQA, a benchmark of multi-answer questions for evaluating coverage.
This paper proposes an approximation algorithm for the streaming stochastic submodular maximization problem under a novel scenario of on-demand user requests.
We show that limiting a model's confidence during training can improve test-time scaling in mathematical reasoning.
Develops split conformal methods that achieve approximately conditional coverage on new predictions.