We investigate an efficient and effective method for scheduling the context-window length during language model pretraining.