4 papers across 3 sessions
We provide a sharp convergence analysis of sketched adaptive distributed deep learning that depends only on the intrinsic dimension of the loss Hessian, rather than on the full dimensionality.
Distributed mediation analysis