Samuel Marks

Researcher, Anthropic

1 paper at NeurIPS 2025

OpenReview · Semantic Scholar · Google Scholar

Poster Session 5

1 paper

Friday, December 5, 2025 · 11:00 AM → 2:00 PM

Exhibit Hall C, D, E

Erasing Conceptual Knowledge from Language Models

#1814 · Rohit Gandikota, Sheridan Feucht, Samuel Marks, David Bau

This paper introduces ELM, a method that erases a concept from a language model by drawing on the model's own ability to classify text as related to that concept. It applies targeted updates that make the model less likely to generate the concept while preserving its overall abilities and resisting attacks.