logo
today local_bar
Poster Session 5 · Friday, December 5, 2025 11:00 AM → 2:00 PM
#3001

Oracle-Efficient Combinatorial Semi-Bandits

NeurIPS OpenReview

Abstract

We study the combinatorial semi-bandit problem where an agent selects a subset of base arms and receives individual feedback. While this generalizes the classical multi-armed bandit and has broad applicability, its scalability is limited by the high cost of combinatorial optimization, requiring oracle queries at every round.
To tackle this, we propose oracle-efficient frameworks that significantly reduce oracle calls while maintaining tight regret guarantees. For worst-case linear rewards, our algorithms achieve regret using only oracle queries. We also propose covariance-adaptive algorithms that leverage noise structure for improved regret, and extend our approach to general (non-linear) rewards.
Overall, our methods reduce oracle usage from linear to (doubly) logarithmic in time, with strong theoretical guarantees.