Agent Safety - NeurIPS 2025

Agent Safety

3 papers across 2 sessions

Poster Session 1

Wednesday, December 3, 2025 · 11:00 AM → 2:00 PM

Exhibit Hall C,D,E

RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents

#1201 · Jingyi Yang, Shuai Shao, Dongrui Liu, Jing Shao

TAI3: Testing Agent Integrity in Interpreting User Intent

#5404 · Shiwei Feng, Xiangzhe Xu, Xuan Chen, Kaiyuan Zhang, Syed Ahmed, Zian Su, Mingwei Zheng, Xiangyu Zhang

This paper presents TAI3, a stress-testing framework that uses targeted input mutations to expose LLM agent errors that deviate from user intent.

Poster Session 6

Friday, December 5, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C,D,E

AgentAuditor: Human-level Safety and Security Evaluation for LLM Agents

#1106 · Hanjun Luo, Shenyu Dai, Chiming Ni, Xinfeng Li, Guibin Zhang, Kun Wang, Tongliang Liu, Hanan Salam