Somayeh Sojoudi

Associate Professor, University of California, Berkeley

3 papers at NeurIPS 2025


Poster Session 4

1 paper

Thursday, December 4, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C,D,E

OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models

#1015 · Ziheng Cheng, Yixiao Huang, Hui Xu, Somayeh Sojoudi, Xuandong Zhao, Dawn Song, Song Mei

Poster Session 5

2 papers

Friday, December 5, 2025 · 11:00 AM → 2:00 PM

Exhibit Hall C,D,E

Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers

#1516 · Yixiao Huang, Hanlin Zhu, Tianyu Guo, Jiantao Jiao, Somayeh Sojoudi, Michael I. Jordan, Stuart Russell, Song Mei

Revising and Falsifying Sparse Autoencoder Feature Explanations

#2600 · George Ma, Samuel Pfrommer, Somayeh Sojoudi

We developed new methods to revise and falsify sparse autoencoder feature explanations, yielding higher-quality explanations for interpreting large language models.