Ivan Evtimov

Research Scientist, Facebook

3 papers at NeurIPS 2025

Poster Session 2

1 paper

Wednesday, December 3, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C,D,E

WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks

#1311 · Ivan Evtimov, Arman Zharmagambetov, Aaron Grattafiori, Chuan Guo, Kamalika Chaudhuri

A benchmark of realistic security scenarios for LLM-based web agents.

Poster Session 3

1 paper

Thursday, December 4, 2025 · 11:00 AM → 2:00 PM

Exhibit Hall C,D,E

AdvPrefix: An Objective for Nuanced LLM Jailbreaks

#5305 · Sicheng Zhu, Brandon Amos, Yuandong Tian, Chuan Guo, Ivan Evtimov

A new LLM jailbreak objective that enables more nuanced control over jailbroken responses, exploits undergeneralization of safety alignment, and improves success rates of existing jailbreaks from 14% to 80%.

Poster Session 4

1 paper

Thursday, December 4, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C,D,E

AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents

#1304 · Arman Zharmagambetov, Chuan Guo, Ivan Evtimov, Maya Pavlova, Ruslan Salakhutdinov, Kamalika Chaudhuri

A novel privacy benchmark for AI agents that evaluates their adherence to the data minimization principle in a full-stack, end-to-end environment.