PhD student, Institute of Science and Technology Austria
2 papers at NeurIPS 2025
We present a method for accurate, end-to-end FP4 training of large language models.
We investigate new scaling laws that predict how LLM performance scales when training over quantized or sparse representations.