PhD student, University of Texas at Austin
1 paper at NeurIPS 2025
In the paper, we study the convergence rates of the maximum likelihood estimator for the gating and prompt parameters of the softmax-contaminated mixture of experts (MoE).