PhD student, EPFL (École Polytechnique Fédérale de Lausanne)
Two papers at NeurIPS 2025:
We show, in theory and in practice, that if causal abstraction is allowed to use arbitrary non-linear transformations, any neural network (even a randomly initialized one) can be perfectly aligned with any algorithm, rendering this interpretability approach meaningless unless the transformations are constrained.
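A minimal sketch of this failure mode, not the paper's actual interchange-intervention setup: freeze a randomly initialized network and train an unconstrained non-linear map from its activations onto a high-level variable of a target algorithm (here, XOR). Near-perfect accuracy illustrates that an expressive enough alignment map "succeeds" regardless of what the network computes; all names and hyperparameters below are illustrative.

```python
# Toy illustration (assumed setup, not the paper's code): a non-linear map
# can "align" a frozen *random* network to an arbitrary algorithm.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Frozen, randomly initialized network whose hidden layer we try to align.
random_net = nn.Sequential(nn.Linear(2, 64), nn.ReLU())
for p in random_net.parameters():
    p.requires_grad_(False)

# Target algorithm: XOR of two binary inputs (the high-level variable).
x = torch.randint(0, 2, (4096, 2)).float()
y = (x[:, 0] != x[:, 1]).long()

# Unconstrained non-linear alignment map (a small MLP).
align = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(align.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for step in range(500):
    with torch.no_grad():
        h = random_net(x)   # activations of the random network
    logits = align(h)       # non-linear "alignment" onto the algorithm
    loss = loss_fn(logits, y)
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    acc = (align(random_net(x)).argmax(-1) == y).float().mean()
print(f"alignment accuracy on a random network: {acc:.3f}")  # ~1.0
```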
Using crosscoders (a sparse-autoencoder variant) to identify concepts introduced by chat tuning, we diagnose spurious chat-only concepts that arise as artifacts of the L1 sparsity loss and show that BatchTopK training robustly reveals genuine, interpretable ones.
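A minimal sketch of the BatchTopK idea, with illustrative shapes and names rather than the paper's code: instead of an L1 penalty that shrinks every latent toward zero (which can manufacture spurious model-specific directions in a crosscoder), keep only the k largest activations per sample, computed jointly across the whole batch, and zero out the rest.

```python
# Assumed, simplified BatchTopK activation for a crosscoder encoder;
# the k-per-sample budget is shared across the batch, so per-sample
# sparsity varies but averages to k.
import torch

def batch_topk(acts: torch.Tensor, k_per_sample: int) -> torch.Tensor:
    """Keep the (batch_size * k_per_sample) largest activations in the batch.

    acts: (batch, n_latents) non-negative pre-activations.
    """
    batch, n_latents = acts.shape
    k_total = batch * k_per_sample
    # Threshold = value of the k_total-th largest activation across the batch.
    threshold = acts.flatten().topk(k_total).values.min()
    return torch.where(acts >= threshold, acts, torch.zeros_like(acts))

# Usage: sparse latents without a shrinkage-inducing L1 term.
pre_acts = torch.relu(torch.randn(8, 512))
latents = batch_topk(pre_acts, k_per_sample=32)
print((latents != 0).sum(dim=1))  # ~32 active latents per sample on average
```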