1 paper across 1 session
Mixture of Decoders (MxDs) show that dense layers are represented more faithfully by mixtures of specialized sublayers than by sparsely activating neurons, and that these mixtures remain just as interpretable.