Researcher, NVIDIA
3 papers at NeurIPS 2025
We provide a systematic exploration and roadmap for latency-optimal small language models through optimized architectural and training strategies.
Long-RL enables RL training on hour-long videos on a single A100; LongVILA-R1-7B supports up to 8,192 frames per video and scores 65.1%/71.1% on VideoMME (without/with subtitles).