Researcher, Microsoft
1 paper at NeurIPS 2025
We argue that conclusions about relative system safety or attack method efficacy drawn from AI red teaming are often not supported by the evidence that attack success rate (ASR) comparisons provide.