Researcher, Google
1 paper at NeurIPS 2025
We show that DiLoCo, a method for communication-efficient language model training, exhibits reliable scaling law behavior.