Full Professor, Queen Mary, University of London
Two papers at NeurIPS 2025
MxDs show that dense layers are more faithfully represented by mixtures of specialized sublayers than by sparsely activating neurons, while remaining just as interpretable.
We ensure backward compatibility through multiple transformations and a relaxed orthogonality constraint for distribution-specific adaptation.