Researcher, Alibaba Group
4 papers at NeurIPS 2025
We find that applying a query-dependent, head-specific sigmoid gate after Scaled Dot-Product Attention (SDPA) consistently improves performance and scaling properties, and mitigates the 'massive activation' and 'attention sink' phenomena.
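A minimal PyTorch sketch of the idea, assuming a single linear gate projection per layer (reshaped into per-head gates); module names, shapes, and the exact gate parameterization are illustrative assumptions, not the paper's implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedAttention(nn.Module):
    """Multi-head attention with a query-dependent, head-specific sigmoid
    gate applied elementwise to the SDPA output (illustrative sketch)."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        # Gate projection: maps each token's input state to one gate value
        # per output dimension; reshaped below into per-head gates.
        self.gate_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (B, n_heads, T, d_head) for SDPA.
        q, k, v = (t.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
                   for t in (q, k, v))
        attn = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        # Query-dependent gate in (0, 1), computed from the same token's
        # input representation and applied AFTER SDPA, per head.
        gate = torch.sigmoid(self.gate_proj(x))  # (B, T, D)
        gate = gate.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        attn = attn * gate
        attn = attn.transpose(1, 2).reshape(B, T, D)
        return self.out_proj(attn)
```

Because the gate can drive a head's contribution toward zero for specific queries, it gives each head a learned, input-dependent "off switch", which is one intuition for why it can suppress attention-sink behavior.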
A scalable system for foundation-model data processing, offering 150+ multimodal operators (OPs), cloud-native efficiency (TB-scale data on 10k+ cores), and diverse interfaces (Python, APIs, chat); widely adopted in research and industry (e.g., Alibaba Cloud).
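To make the "composable OPs" idea concrete, here is a hypothetical sketch of how filter- and mapper-style operators chain over samples; all class and function names are invented for illustration and are not the system's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Sample:
    text: str
    meta: dict = field(default_factory=dict)

class WhitespaceNormalizer:
    """Mapper-style OP: collapse repeated whitespace in the text field."""
    def __call__(self, s: Sample) -> Sample:
        s.text = " ".join(s.text.split())
        return s

class LengthFilter:
    """Filter-style OP: drop samples whose text length is out of range."""
    def __init__(self, min_len: int = 10, max_len: int = 10_000):
        self.min_len, self.max_len = min_len, max_len
    def __call__(self, s: Sample):
        return s if self.min_len <= len(s.text) <= self.max_len else None

def run_pipeline(samples, ops):
    """Apply each OP in order; an OP returning None drops the sample."""
    for s in samples:
        for op in ops:
            s = op(s)
            if s is None:
                break
        else:
            yield s

if __name__ == "__main__":
    data = [Sample("hello   world"), Sample("too short")]
    cleaned = run_pipeline(data, [WhitespaceNormalizer(), LengthFilter(min_len=10)])
    print([s.text for s in cleaned])  # ['hello world']
```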