2 papers across 2 sessions
Variational Learning Finds Flatter Solutions at the Edge of Stability
We introduce a method that uses noise injection to elicit the latent capabilities of sandbagging LLMs.