We show that deep neural networks instantiate the same abstract algorithm for modular addition across architectures and training conditions.