We propose a novel self-improvement algorithm to teach language models to perform effective search.