ZIJIA CHEN

Researcher, NVIDIA

2 papers at NeurIPS 2025

OpenReview · Semantic Scholar · Google Scholar

Poster Session 2

Wednesday, December 3, 2025 · 4:30 PM → 7:30 PM

Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning

#3518 · Ali Taghibakhshi, Sharath Turuvekere Sreenivas, Saurav Muralidharan, Marcin Chochowski, Yashaswi Karnati, Raviraj Bhuminand Joshi, Ameya Sunil Mahabaleshwarkar, ZIJIA CHEN, Yoshi Suhara, Oluwatobi Olabiyi, Daniel Korzekwa, Mostofa Patwary, Mohammad Shoeybi, Jan Kautz, Bryan Catanzaro, Ashwath Aithal, Nima Tajbakhsh, Pavlo Molchanov

The paper introduces a pruning and distillation method for hybrid LLMs, compressing Nemotron-H 8B to 4B with better accuracy and ~2× faster inference, advancing the efficiency-accuracy trade-off.

Poster Session 4

1 paper

Thursday, December 4, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C, D, E

Nemotron-CLIMB: Clustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

#111 Spotlight · Shizhe Diao, Yu Yang, Yonggan Fu, Xin Dong, Dan SU, Markus Kliegl, ZIJIA CHEN, Peter Belcak, Yoshi Suhara, Hongxu Yin, Mostofa Patwary, Yingyan Celine Lin, Jan Kautz, Pavlo Molchanov

Nemotron-CLIMB automates data mixture optimization for pre-training, improving domain adaptation and outperforming Llama-3.2-1B by 2.0% on general reasoning.