2 papers across 2 sessions
We propose PolyJuice, a black-box red-teaming method that steers text-to-image generative models to produce images that deceive a synthetic image detector.
Under input uncertainty, transformer models exhibit a systematic exploration of input‑agnostic conceptual representations, increasing the likelihood of hallucinations.