PhD student, University of Massachusetts, Amherst
1 paper at NeurIPS 2025
We introduce CARES, an 18K-prompt benchmark for evaluating the medical safety of LLMs under adversarial conditions, featuring graded harm levels, jailbreak attacks, and a fine-grained response metric.