4 papers across 3 sessions
MLPs contain "channels to infinity": regions of the loss landscape where pairs of neurons evolve into gated linear units with diverging output weights. These regions look like flat minima, but the loss along them is actually still slowly decreasing.
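A small numerical sketch of the mechanism (my own construction, not the paper's code): two tanh neurons with nearly identical input weights and opposite-sign output weights that diverge like ±c/ε implement, in the ε → 0 limit, a gated linear unit of the form c·x·tanh'(w·x).

```python
import numpy as np

def pair_output(x, w, c, eps):
    """Two-neuron pair: input weights w+eps and w, output weights +-c/eps.
    The output weights diverge as eps -> 0, yet the function stays bounded."""
    return (c / eps) * np.tanh((w + eps) * x) - (c / eps) * np.tanh(w * x)

def glu_limit(x, w, c):
    """The limiting gated linear unit: a linear term x gated by tanh'(w*x)."""
    return c * x * (1.0 - np.tanh(w * x) ** 2)

x = np.linspace(-3, 3, 200)
w, c = 0.7, 1.5
for eps in (1e-1, 1e-2, 1e-3):
    err = np.max(np.abs(pair_output(x, w, c, eps) - glu_limit(x, w, c)))
    print(f"eps={eps:g}  max |pair - GLU| = {err:.2e}")
```

The approximation error shrinks roughly linearly in ε, so the pair gets ever closer to the gated linear unit as the output weights grow without bound.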
Encouraging the learned dynamics model in model-based reinforcement learning to converge to flatter minima of its loss landscape results in better downstream policies.
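One generic way to bias a learned dynamics model toward flat minima is a sharpness-aware minimization (SAM) style update: evaluate the gradient at a nearby worst-case perturbation of the weights, then apply it at the current weights. The sketch below is my own toy construction (linear dynamics model, synthetic data), not necessarily the method used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy transition data: next_state = A_true @ state + B_true @ action + noise.
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[0.0], [0.5]])
states = rng.normal(size=(256, 2))
actions = rng.normal(size=(256, 1))
next_states = (states @ A_true.T + actions @ B_true.T
               + 0.01 * rng.normal(size=(256, 2)))

X = np.hstack([states, actions])   # (256, 3) model inputs
W = np.zeros((3, 2))               # linear dynamics model parameters

def loss_and_grad(W):
    pred = X @ W
    err = pred - next_states
    return np.mean(err ** 2), 2.0 * X.T @ err / err.size

rho, lr = 0.05, 0.1
for _ in range(200):
    _, g = loss_and_grad(W)
    # SAM step: take the gradient at the ascent point W + rho * g/||g||,
    # then apply it at W; this penalizes sharp minima.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    _, g_adv = loss_and_grad(W + eps)
    W -= lr * g_adv

final_loss, _ = loss_and_grad(W)
print(f"final training loss: {final_loss:.4f}")
```

Note the design choice: the perturbation radius `rho` controls the neighborhood over which the loss must stay low, i.e. how strongly flatness is enforced, at the cost of a small bias away from the exact empirical minimizer.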