1 paper across 1 session
Achieving High Accuracy in Distributed Training Even with Aggressive Gradient Compression