PhD student, Northeastern University
2 papers at NeurIPS 2025
We introduce evaluations that reveal limitations in how diffusion models erase concepts.
This paper introduces ELM, a method for erasing concepts from language models that relies on the model's own ability to classify knowledge. It applies targeted updates that reduce generation of the erased concept while preserving the model's overall abilities and resisting attacks.