Full Professor, ETHZ - ETH Zurich
3 papers at NeurIPS 2025
We combine discrete and continuous adversarial attacks during adversarial training to obtain more robust LLMs. We evaluate our models in several realistic inference settings and show that they are more robust while matching the training cost of other state-of-the-art models.
We introduce MathArena, a new benchmark for evaluating LLMs on recurring math competitions, which provide a stream of high-quality, uncontaminated problems.
We study token-level watermarking in the context of autoregressive image generation models.