3 papers across 2 sessions
We propose the first Best-of-Both-Worlds algorithm for multi-armed bandits with adversarial delays that matches the lower bounds in both the stochastic and adversarial settings, significantly improving on previous results.
This paper aims to broaden the theoretical foundation of Follow-the-Perturbed-Leader (FTPL) and to emphasize the need for further investigation of its behavior in more general settings.