6 papers across 3 sessions
We investigate why masked diffusion dominates discrete diffusion.
We show the emergence of an alignment between LLMs' and the brain's computational dynamics, and identify the key factors enabling it: scale and context size.
We introduce Reinforcement-Learned Teachers, a new class of models trained to provide effective reasoning traces for downstream distillation, yielding better data for distillation and cold-starting than reasoning LMs orders of magnitude larger.
We present COS3D, a new collaborative prompt-segmentation framework that integrates complementary language and segmentation cues throughout the entire pipeline.
We introduce ALE-bench, a new benchmark for evaluating AI systems on score-based algorithmic programming contests.
We train RL agents directly from high-level specifications, without reward functions or domain-specific oracles.