4 papers across 3 sessions
A convergence analysis of, and experiments with, a new label model.
We introduce Weaver, a framework that combines multiple weak verifiers to select responses in repeated sampling, achieving frontier-model accuracy without supervised fine-tuning while reducing verification costs by 99.97%.
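The core idea of combining weak verifiers for response selection can be sketched as a weighted score over candidates. This is a minimal illustration of the general technique, not Weaver's actual algorithm; the verifiers and weights below are toy assumptions.

```python
def select_response(candidates, verifiers, weights):
    """Pick the candidate with the highest weighted sum of verifier scores."""
    def score(c):
        return sum(w * v(c) for v, w in zip(verifiers, weights))
    return max(candidates, key=score)

# Toy weak verifiers: each is a noisy, individually unreliable quality signal.
length_ok = lambda c: 1.0 if len(c) < 20 else 0.0   # hypothetical check
has_digit = lambda c: 1.0 if any(ch.isdigit() for ch in c) else 0.0

best = select_response(
    ["forty two", "42", "a very long rambling answer that never commits"],
    [length_ok, has_digit],
    [0.5, 0.5],
)
# best == "42": the only candidate passing both weak checks
```

In practice the weights would be learned from the verifiers' estimated accuracies rather than fixed by hand.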
We use in-context learning as weak supervision to train a student model that internalizes demonstration-induced latent shifts via adapter tuning, enabling efficient inference with improved generalization.
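The idea of internalizing a demonstration-induced latent shift can be sketched numerically: a teacher's with-context behavior equals a frozen map plus a low-rank shift, and the student absorbs that shift into rank-r adapter weights. This is a hedged stand-in using a closed-form least-squares fit and SVD truncation, not the paper's adapter-tuning procedure; all symbols (W, S, Delta) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 8, 2, 256
W = rng.normal(size=(d, d))                              # frozen base weights
S = rng.normal(size=(d, r)) @ rng.normal(size=(r, d))    # demonstration-induced shift (rank r)
X = rng.normal(size=(n, d))                              # training inputs
T = X @ (W + S)                                          # teacher outputs with demonstrations

# Estimate the latent shift the demonstrations induced, then compress it
# into a rank-r adapter (SVD truncation as a stand-in for adapter tuning).
Delta, *_ = np.linalg.lstsq(X, T - X @ W, rcond=None)
U, s, Vt = np.linalg.svd(Delta)
adapter = U[:, :r] * s[:r] @ Vt[:r]                      # rank-r adapter weights

# The student runs without demonstrations, using only the adapter.
err = np.mean((X @ (W + adapter) - T) ** 2)              # near zero: shift recovered
```

At inference the student applies `W + adapter` directly, avoiding the cost of re-processing demonstrations on every query.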