PhD student, Northwestern University
2 papers at NeurIPS 2025
We introduce CARES, an 18K-prompt benchmark for evaluating the medical safety of LLMs under adversarial conditions, with graded harms, jailbreaks, and a fine-grained response metric.