1 paper across 1 session
We stabilize gradients for training increasingly deep reinforcement learning agents by using a second-order optimizer and residual connections.