PhD student, The Hong Kong University of Science and Technology
2 papers at NeurIPS 2025
We proposed a new data selection method for pretraining multilingual large language models.