MS student, EPFL - EPF Lausanne
1 paper at NeurIPS 2025
We introduce a benchmark to measure the safety of general computer-use agents across diverse categories of harm.