Assistant Professor, Computer Science Department, Stanford University
Three papers at NeurIPS 2025
SPRINT trains large reasoning models to dynamically identify independent subtasks and execute them in parallel, reducing sequential token generation by up to 40% on complex tasks such as math problems without sacrificing accuracy relative to sequential reasoning baselines.
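A minimal sketch of the core idea — running independent reasoning subtasks concurrently rather than one after another. All names here (`solve_subtask`, the stage-grouped plan) are illustrative stand-ins, not SPRINT's actual interface; a real system would issue model calls instead of the placeholder function.

```python
# Illustrative sketch: subtasks grouped into stages; subtasks within a
# stage are mutually independent, so they can run in parallel.
from concurrent.futures import ThreadPoolExecutor

def solve_subtask(subtask: str) -> str:
    # Hypothetical stand-in for a model call that solves one subtask.
    return f"answer({subtask})"

def solve(task_plan: list[list[str]]) -> list[str]:
    results = []
    with ThreadPoolExecutor() as pool:
        for stage in task_plan:
            # pool.map runs the stage's subtasks concurrently and
            # preserves input order in the returned results.
            results.extend(pool.map(solve_subtask, stage))
    return results

plan = [["factor n", "compute gcd"], ["combine results"]]
print(solve(plan))
```

The sequential-token savings come from the first stage: its two subtasks overlap in time instead of being generated back to back.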
We introduce Weaver, a framework that combines multiple weak verifiers to select correct responses under repeated sampling, achieving frontier-model accuracy without supervised fine-tuning while reducing verification costs by 99.97%.
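The selection step can be sketched as a weighted vote over verifier scores. This is a toy illustration, not Weaver's method: the verifiers and weights below are hypothetical, and the paper's contribution includes estimating those reliabilities without labeled supervision, which this sketch takes as given.

```python
# Illustrative sketch: pick one of k sampled responses by aggregating
# scores from several weak verifiers, weighted by assumed reliability.
def select_response(responses, verifiers, weights):
    # Each verifier maps a response to a score in [0, 1].
    def aggregate(r):
        return sum(w * v(r) for v, w in zip(verifiers, weights))
    return max(responses, key=aggregate)

responses = ["x = 4", "x = 5"]
verifiers = [lambda r: 1.0 if "4" in r else 0.0,   # hypothetical checker
             lambda r: 0.6 if "4" in r else 0.4]   # hypothetical scorer
weights = [0.7, 0.3]
print(select_response(responses, verifiers, weights))  # -> "x = 4"
```

Because each verifier is cheap relative to a strong reward model, the aggregate check costs a small fraction of full verification.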
We propose grafting, a simple approach to materialize new architectures by editing pretrained diffusion transformers. It enables architectural exploration under small compute budgets.
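A toy sketch of the editing idea, under loose assumptions: grafting swaps an operator inside an already-trained model for a new one (keeping the rest intact), so only the edited piece needs tuning. The classes below are hypothetical stand-ins, not the paper's diffusion-transformer code.

```python
# Illustrative sketch: edit a pretrained model by swapping the mixer
# operator in some blocks, leaving the remaining blocks untouched.
class Attention:
    def __call__(self, x):
        # Stand-in for a pretrained (expensive) global mixing operator.
        return [xi + sum(x) / len(x) for xi in x]

class LocalMixer:
    def __call__(self, x):
        # Hypothetical cheaper replacement operator to be grafted in.
        return [xi * 1.0 for xi in x]

class Block:
    def __init__(self, mixer):
        self.mixer = mixer
    def __call__(self, x):
        return self.mixer(x)

model = [Block(Attention()) for _ in range(4)]  # "pretrained" model

# Graft: materialize a new architecture by editing the last two blocks.
for blk in model[2:]:
    blk.mixer = LocalMixer()
```

Since most of the model is reused as-is, each candidate architecture can be evaluated after tuning only the grafted operators, keeping the compute budget small.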