Full Professor, Swiss Federal Institute of Technology
2 papers at NeurIPS 2025
We show, both in theory and in practice, that allowing non-linear transformations in causal abstraction lets any neural network (even a randomly initialised one) be perfectly aligned to any algorithm, rendering this interpretability approach meaningless if left unconstrained.
We propose a parametrisation of SSM transition matrices that enables SSMs to track states of arbitrary finite-state automata while keeping the cost of the parallel scan comparable to that of diagonal SSMs.
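The building block behind that cost claim is the linear SSM recurrence h_t = a_t * h_{t-1} + u_t, which for diagonal transition matrices can be evaluated in parallel with an associative scan. Below is a minimal NumPy sketch of that idea for the diagonal case, using a Hillis-Steele-style scan; this is my own illustration under those assumptions, not the paper's parametrisation or code, and the function names are hypothetical.

```python
import numpy as np

def combine(left, right):
    """Associative operator for the recurrence h = a*h_prev + b.

    Composing step `left` (earlier) then `right` (later) gives another
    affine map: (a2, b2) o (a1, b1) = (a2*a1, a2*b1 + b2).
    """
    a1, b1 = left
    a2, b2 = right
    return a2 * a1, a2 * b1 + b2

def diag_ssm_parallel_scan(a, u):
    """Compute h_t = a_t * h_{t-1} + u_t (with h_{-1} = 0) for all t.

    a, u: arrays of shape (T, d) holding per-step diagonal decays and
    input contributions. Uses an inclusive Hillis-Steele scan: O(log T)
    parallel steps, each an elementwise combine (cheap for diagonal a).
    """
    T = a.shape[0]
    A, B = a.copy(), u.copy()
    step = 1
    while step < T:
        A_new, B_new = A.copy(), B.copy()
        # In a real parallel implementation this loop runs concurrently.
        for t in range(step, T):
            A_new[t], B_new[t] = combine((A[t - step], B[t - step]),
                                         (A[t], B[t]))
        A, B = A_new, B_new
        step *= 2
    return B  # B[t] equals h_t since h_{-1} = 0

# Usage: agrees with the sequential recurrence.
a = np.full((8, 2), 0.5)
u = np.ones((8, 2))
h_parallel = diag_ssm_parallel_scan(a, u)
```

The point of the paper's parametrisation, as the summary states, is to allow richer (non-diagonal) transition structure for tracking finite-state automata while keeping each combine step about as cheap as this diagonal one.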