Principal Researcher, ByteDance Inc.
3 papers at NeurIPS 2025
We propose a novel model "MoE-ization" strategy using SVD, which leads to a conflict- and oblivion-resistant multi-task adaptation method.
We propose a new data selection method for pretraining multilingual large language models.