We introduce a method that allows adversarial optimization to be used in general-sum settings to train more robust and diverse policies.