Researcher, DeepMind
2 papers at NeurIPS 2025
We show that DiLoCo, a method for communication-efficient language model training, exhibits reliable scaling-law behavior.