Assistant Professor, Eberhard-Karls-Universität Tübingen
3 papers at NeurIPS 2025
We introduce a quantization-free way to train autoregressive transformers for continuous-action decision making, improving on methods that discretize actions.
We study the problem of non-stationary Lipschitz bandits and achieve the minimax-optimal rate without knowledge of the non-stationarity.
A Max K-armed Bandit algorithm, built on assumptions derived from empirical data, that handles short-tailed and skewed distributions and dynamically allocates resources to hyperparameter optimization runs.