Associate Professor, EPFL
4 papers at NeurIPS 2025
This paper introduces GRAPE, a novel multi-source-multi-target domain reweighting framework designed to calibrate pretraining data mixtures for robust performance across multiple target tasks simultaneously.
We take an important step toward pure FP8 LLM training by enabling stable FP8 dot-product attention, reaching new throughput records.