1 paper across 1 session
We established an optimal risk bound of $1/(\gamma^2 n)$ for gradient descent (GD) with deep ReLU networks.