Postdoc, ETHZ - ETH Zurich
2 papers at NeurIPS 2025
We take an important step towards pure FP8 training of LLMs by enabling stable FP8 dot-product attention, reaching new throughput records