Poster Session 6 · Friday, December 5, 2025 4:30 PM → 7:30 PM
#3110

Improved Regret Bounds for Linear Bandits with Heavy-Tailed Rewards

NeurIPS OpenReview

Abstract

We study stochastic linear bandits with heavy-tailed rewards, where the rewards have a finite $(1+\epsilon)$-absolute central moment bounded by $\upsilon$ for some $\epsilon \in (0, 1]$. We improve both upper and lower bounds on the minimax regret compared to prior work. When $\upsilon = \Theta(1)$, the best previously known regret upper bound is $\tilde{O}\big(d\, T^{\frac{1}{1+\epsilon}}\big)$. While a lower bound with the same scaling has been given, it relies on a construction using $\upsilon = \Theta(d)$, and adapting the construction to the bounded-moment regime with $\upsilon = \Theta(1)$ yields only an $\Omega\big(d^{\frac{\epsilon}{1+\epsilon}}\, T^{\frac{1}{1+\epsilon}}\big)$ lower bound. This matches the known rate for multi-armed bandits and is generally loose for linear bandits; in particular, it is a factor of $\sqrt{d}$ below the optimal rate in the finite-variance case ($\epsilon = 1$).
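To see the gap concretely, one can compare the multi-armed-bandit-style heavy-tailed rate $d^{\frac{\epsilon}{1+\epsilon}} T^{\frac{1}{1+\epsilon}}$ against the optimal finite-variance linear-bandit rate $d\sqrt{T}$ at $\epsilon = 1$. A minimal numeric check (the function names and the choices of $d$ and $T$ below are illustrative, not from the abstract):

```python
import math

def mab_style_rate(d, T, eps):
    """Multi-armed-bandit-style heavy-tailed rate: d^(eps/(1+eps)) * T^(1/(1+eps))."""
    return d ** (eps / (1 + eps)) * T ** (1 / (1 + eps))

def optimal_linear_rate(d, T):
    """Optimal finite-variance (eps = 1) linear bandit rate: d * sqrt(T)."""
    return d * math.sqrt(T)

d, T = 100, 10**6
ratio = optimal_linear_rate(d, T) / mab_style_rate(d, T, eps=1.0)
print(ratio)  # at eps = 1 the ratio equals sqrt(d), i.e. 10.0 here
```

At $\epsilon = 1$ the two rates differ by exactly $d^{1 - 1/2} = \sqrt{d}$, which is the looseness referred to above.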
We propose a new elimination-based algorithm guided by experimental design, which achieves regret $\tilde{O}\big(d^{\frac{1+3\epsilon}{2(1+\epsilon)}}\, T^{\frac{1}{1+\epsilon}}\big)$, thus improving the dependence on $d$ for all $\epsilon < 1$ and recovering a known optimal result for $\epsilon = 1$. We also establish a lower bound of $\Omega\big(d^{\frac{2\epsilon}{1+\epsilon}}\, T^{\frac{1}{1+\epsilon}}\big)$, which strictly improves upon the multi-armed bandit rate and highlights the hardness of heavy-tailed linear bandit problems.
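The general elimination-plus-design recipe can be sketched generically: in each phase, compute a (near-)G-optimal design over the surviving arms, sample according to it, estimate the parameter, and discard arms whose estimated gap exceeds the phase's confidence width. The sketch below is a toy illustration only, assuming Gaussian noise and a plain least-squares estimate; it does not reproduce the paper's algorithm, whose heavy-tail-robust estimation is the key ingredient, and all function names, the noise level, and the width schedule are illustrative assumptions.

```python
import numpy as np

def g_optimal_weights(A, iters=300):
    """Frank-Wolfe (Fedorov-Wynn) approximation of the G-optimal design on rows of A."""
    k, d = A.shape
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        V = A.T @ (w[:, None] * A)
        lev = np.einsum('ij,jk,ik->i', A, np.linalg.pinv(V), A)  # a^T V(w)^{-1} a
        j = int(np.argmax(lev))
        if lev[j] <= d + 1e-9:
            break  # Kiefer-Wolfowitz optimality condition reached
        step = (lev[j] - d) / (d * (lev[j] - 1.0))
        w *= (1.0 - step)
        w[j] += step
    return w

def phased_elimination(A, theta, phases=6, n_per_phase=2000, rng=None):
    """Toy phased elimination: sample surviving arms per a G-optimal design,
    fit least squares, and drop arms whose estimated gap exceeds twice the
    (illustrative, geometrically shrinking) confidence width."""
    rng = np.random.default_rng(0) if rng is None else rng
    alive = np.arange(len(A))
    for ell in range(phases):
        S = A[alive]
        w = g_optimal_weights(S)
        counts = np.maximum(1, np.ceil(w * n_per_phase).astype(int))
        X = np.repeat(S, counts, axis=0)          # pulls allocated by the design
        y = X @ theta + 0.1 * rng.standard_normal(len(X))
        theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        means = S @ theta_hat
        width = 2.0 ** (-ell)
        alive = alive[means >= means.max() - 2 * width]
        if len(alive) == 1:
            break
    return alive
```

On an easy instance (e.g. standard basis arms plus one near-optimal arm), the surviving set shrinks to the single best arm within a few phases.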
For finite action sets of size $n$, we also derive corresponding upper and lower bounds.
Finally, we provide action-set-dependent regret upper bounds and show that for some geometries, such as certain $\ell_p$-norm balls, we can further reduce the dependence on $d$.