science of language models

1 paper across 1 session

Poster Session 1

Wednesday, December 3, 2025 · 11:00 AM → 2:00 PM

What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers

Early phase training of Transformers on algorithmic tasks shows a plateau in loss, repetition bias and representation collapse before sudden drop in loss.