PhD student, University of Science and Technology of China
1 paper at NeurIPS 2025
We propose a framework for efficient MoE post-training on 3.5D wafer-scale chiplets.