PhD student, Tel Aviv University
2 papers at NeurIPS 2025
We design a PAC learner for contextual combinatorial semi-bandits with sparse rewards; its sample complexity bound scales primarily with the sparsity parameter rather than with the number of arms.
We present regret bounds for adversarial contextual bandits with general function approximation under delayed bandit feedback.