Load Balancing

1 paper across 1 session

Poster Session 4

Thursday, December 4, 2025 · 4:30 PM → 7:30 PM

Advancing Expert Specialization for Better MoE

#5416 · Hongcan Guo, Haolang Lu, Guoshun Nan, Bolun Chu, Jialin Zhuang, Yuan Yang, Wenhao Che, Xinye Cao, Sicong Leng, Qimei Cui, Xudong Jiang

We propose orthogonality and variance losses that counteract the expert homogenization caused by load balancing in Mixture-of-Experts models. By increasing expert specialization while preserving load balance, these losses improve performance in downstream fine-tuning.
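
To make the idea concrete, the following is a minimal illustrative sketch of what an orthogonality loss and a variance loss could look like. It is not the paper's formulation: the function names, the input shapes, and the choice of cosine similarity and per-expert score variance are all assumptions made here for illustration.

```python
import numpy as np


def orthogonality_loss(expert_outputs: np.ndarray) -> float:
    """Hypothetical orthogonality penalty (an assumption, not the paper's definition).

    expert_outputs has shape (num_experts, dim): one output vector per expert
    for the same token. Returns the mean squared cosine similarity between
    distinct experts, which is 0 when all outputs are mutually orthogonal.
    """
    norms = np.linalg.norm(expert_outputs, axis=1, keepdims=True)
    unit = expert_outputs / np.maximum(norms, 1e-12)  # guard against zero vectors
    gram = unit @ unit.T                              # pairwise cosine similarities
    off_diag = gram - np.diag(np.diag(gram))          # drop self-similarity terms
    n = gram.shape[0]
    return float((off_diag ** 2).sum() / (n * (n - 1)))


def variance_loss(router_probs: np.ndarray) -> float:
    """Hypothetical variance term (an assumption, not the paper's definition).

    router_probs has shape (num_tokens, num_experts). Returns the negated mean
    per-expert variance of routing scores across tokens, so minimizing it
    pushes the router toward more token-dependent, discriminative scores.
    """
    return float(-router_probs.var(axis=0).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    outputs = rng.normal(size=(4, 16))           # 4 experts, 16-dim outputs
    logits = rng.normal(size=(8, 4))             # 8 tokens, 4 experts
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    print("orthogonality loss:", orthogonality_loss(outputs))
    print("variance loss:", variance_loss(probs))
```

In a training setup of this kind, such terms would typically be added to the task loss and the load-balancing auxiliary loss with small weighting coefficients; the weights and the exact combination are left open here because the listing does not specify them.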