PhD student, Eindhoven University of Technology
1 paper at NeurIPS 2025
We proposed a new data selection method for pretraining multilingual large language models.