Weight Averaging

1 paper across 1 session

Poster Session 6

Friday, December 5, 2025 · 4:30 PM → 7:30 PM

Through the River: Understanding the Benefit of Schedule-Free Methods for Language Model Training

#907 · Minhak Song, Beomhan Baek, Kwangjun Ahn, Chulhee Yun

We show that Schedule-Free methods effectively navigate the river structure of the loss landscape, enabling scalable language model training without decay schedules or extra memory.