We present the first finite-sample analysis for policy evaluation in robust average-reward reinforcement learning using semi-norm contractions.