5 papers across 3 sessions
Using a teacher value function and potential-based reward shaping (PBRS), we propose a theoretically grounded method for preference distillation.
In this paper, we investigate the universal approximation property of deep, narrow multilayer perceptrons (MLPs) for $C^1$ functions under the Sobolev norm, specifically the $W^{1, \infty}$ norm.