4 papers across 3 sessions
A versatile data mixture ratio optimization framework for LLM training that enjoys both theoretical and practical advantages.