Concept Unlearning

1 paper across 1 session

Poster Session 5

Friday, December 5, 2025 · 11:00 AM → 2:00 PM

Erasing Conceptual Knowledge from Language Models

#1814 · Rohit Gandikota, Sheridan Feucht, Samuel Marks, David Bau

This paper introduces ELM, a method that erases concepts from language models by using the model itself as a classifier of its own knowledge. It applies targeted updates that reduce the likelihood of generating concept-related content while preserving the model's broader capabilities and resisting adversarial attacks.