Red Teaming

2 papers across 2 sessions

Poster Session 3

Thursday, December 4, 2025 · 11:00 AM → 2:00 PM

PolyJuice Makes It Real: Black-Box, Universal Red Teaming for Synthetic Image Detectors

#4519 · Sepehr Dehdashtian, Mashrur Mahmud Morshed, Jacob Seidman, Gaurav Bharaj, Vishnu Boddeti

We propose PolyJuice, a black-box red teaming method that steers text-to-image generative models to produce images that deceive a synthetic image detector.

Poster Session 6

1 paper

Friday, December 5, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C, D, E

Comparison requires valid measurement: Rethinking attack success rate comparisons in AI red teaming

#1110 · Alex Chouldechova, A. Feder Cooper, Solon Barocas, Abhinav Palia, Dan Vann, Hanna Wallach

We argue that conclusions about relative system safety or attack method efficacy drawn from AI red teaming are often not supported by the evidence that attack success rate (ASR) comparisons provide.