We provide a systematic exploration of, and a roadmap toward, latency-optimal small language models through architectural and training optimizations.