1 paper across 1 session
We propose the first Best-of-Both-Worlds algorithm for multi-armed bandits with adversarial delays that matches lower bounds in both stochastic and adversarial settings, significantly improving previous results.