2 papers across 1 session
Minimax rate of robust estimation when different samples have different but known rates of corruption.
In this paper, we study the convergence rates of the maximum likelihood estimator of the gating and prompt parameters in the softmax-contaminated mixture of experts (MoE).