1 paper across 1 session
We show that vision-language models can piece together scattered training image patches—a capability we call visual stitching—which enables generalization to unseen images but also introduces new safety risks.