Xiaozhi Wang

Assistant Professor, Tsinghua University

2 papers at NeurIPS 2025

Poster Session 1

Wednesday, December 3, 2025 · 11:00 AM → 2:00 PM

Exhibit Hall C,D,E

Towards Understanding Safety Alignment: A Mechanistic Perspective from Safety Neurons

#1010 · Jianhui Chen, Xiaozhi Wang, Zijun Yao, Yushi Bai, Lei Hou, Juanzi Li

In this paper, we interpret the mechanism behind safety alignment through safety neurons and analyze the properties of these neurons.

Poster Session 6

Friday, December 5, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C,D,E

AGENTIF: Benchmarking Large Language Models Instruction Following Ability in Agentic Scenarios

#114 Spotlight · Yunjia Qi, Hao Peng, Xiaozhi Wang, Amy Xin, Youfeng Liu, Bin Xu, Lei Hou, Juanzi Li

We propose a benchmark to evaluate the instruction-following ability of large language models in agentic scenarios.