Researcher, Alibaba Group
2 papers at NeurIPS 2025
We propose LongBioBench, a benchmark for controllable evaluation of long-context language models.
We find that applying a query-dependent, head-specific sigmoid gate after scaled dot-product attention (SDPA) consistently improves performance and scaling properties, and mitigates the 'massive activation' and 'attention sink' phenomena.
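A minimal single-head NumPy sketch of the gating idea: the gate is a sigmoid computed from the query-side hidden state and multiplied elementwise into the SDPA output. All names, shapes, and weight layouts here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def sdpa(q, k, v):
    # Scaled dot-product attention for one head; q, k, v have shape (T, d).
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def gated_attention(x, wq, wk, wv, wg):
    # Query-dependent gate: computed from the same hidden state x that
    # produces the queries, so each position gets its own gate values.
    # (Hypothetical weight names wq/wk/wv/wg; one head shown for clarity.)
    q, k, v = x @ wq, x @ wk, x @ wv
    out = sdpa(q, k, v)
    gate = 1.0 / (1.0 + np.exp(-(x @ wg)))  # elementwise sigmoid in (0, 1)
    return gate * out                        # gate applied AFTER SDPA

rng = np.random.default_rng(0)
T, d = 4, 8
x = rng.standard_normal((T, d))
wq, wk, wv, wg = (rng.standard_normal((d, d)) for _ in range(4))
y = gated_attention(x, wq, wk, wv, wg)
print(y.shape)  # (4, 8)
```

Because the gate lies in (0, 1), it can only attenuate the attention output; per-head, per-query attenuation is what lets the model suppress the outlier activations associated with attention sinks.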