2 papers across 2 sessions
GradMetaNet is a neural architecture that efficiently processes the gradients of other networks by exploiting their symmetries and rank-1 decomposition structure, enabling improved learned optimizers, model editing, and loss-curvature estimation.
In this position paper, we advocate for systematically studying entire model populations and argue that this requires charting them in a unified structure, the "Model Atlas".