PhD student, Massachusetts Institute of Technology
1 paper at NeurIPS 2025
We provide a method for accurate end-to-end FP4 training of large language models.