6 papers across 3 sessions
We investigate why masked diffusion dominates discrete diffusion.
We show the emergence of an alignment between LLMs' and the brain's computational dynamics, and identify the key factors enabling it: scale and context size.
We introduce Reinforcement-Learned Teachers, a new class of models trained to provide effective reasoning traces for downstream distillation, yielding better data for distillation and cold-starting than reasoning LMs orders of magnitude larger.
We present COS3D, a new collaborative prompt-segmentation framework that integrates complementary language and segmentation cues throughout the entire pipeline.
We introduce ALE-bench, a new benchmark for evaluating AI systems on score-based algorithmic programming contests.
We train RL agents directly from high-level specifications, without reward functions or domain-specific oracles.