PhD student, Technische Universität München
1 paper at NeurIPS 2025
We compare leading open SFT datasets, add quality annotations using MagPie, and design curation recipes that yield a leaner, high-performing SFT mixture.