Full Professor, School of Computer Science, Tel Aviv University
5 papers at NeurIPS 2025
We provide individual regret bounds, independent of the graph diameter, for cooperative stochastic multi-armed bandits over communication graphs, and analyze the trade-offs with message size and the number of communication rounds.
We present regret bounds for adversarial contextual bandits with general function approximation under delayed bandit feedback.
We propose the first Best-of-Both-Worlds algorithm for multi-armed bandits with adversarial delays that matches the lower bounds in both the stochastic and adversarial settings, significantly improving on previous results.
We study learning set functions under one-sided feedback in the PAC learning framework.