6 papers across 3 sessions
A framework that uses Monte Carlo Tree Search to guide an LLM in asking information-seeking questions, learning from past successful strategies to solve problems more efficiently and accurately.
We use LLMs to create state-of-the-art AI planners.
We present an algorithm for test-time scaling of SDE-based diffusion models that searches for noise trajectories optimizing arbitrary rewards, empirically matching or exceeding MCTS performance.
A self-supervised method that improves open-weight value models using state-transition dynamics, enabling reward-free, efficient search with performance comparable to search with costly large models and tree-based methods.
We propose DISC, a dynamic decomposition method that adaptively adjusts step sizes during LLM inference to allocate compute more efficiently, significantly improving performance and sample efficiency across reasoning and code generation benchmarks.