3 papers across 3 sessions
We take an important step toward pure FP8 LLM training by enabling stable FP8 dot-product attention, setting new throughput records.
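The paper's actual kernel is not reproduced here; as a rough sketch of the idea, the snippet below simulates FP8 dot-product attention in PyTorch, assuming per-tensor scaling and a higher-precision softmax for stability. The helpers `quantize_fp8` and `fp8_sdpa` are hypothetical names, and a real implementation would feed the FP8 tensors to hardware FP8 GEMMs rather than dequantizing first.

```python
import math
import torch

F8 = torch.float8_e4m3fn  # 4-bit exponent / 3-bit mantissa FP8 format
F8_MAX = torch.finfo(F8).max

def quantize_fp8(x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Per-tensor FP8 quantization: scale so the max maps into FP8 range."""
    scale = x.abs().amax().clamp(min=1e-12) / F8_MAX
    return (x / scale).to(F8), scale

def fp8_sdpa(q, k, v):
    """Simulated FP8 scaled dot-product attention.

    Q, K, V are quantized to FP8 and dequantized before each matmul;
    the softmax stays in higher precision for numerical stability.
    """
    (q8, sq), (k8, sk), (v8, sv) = map(quantize_fp8, (q, k, v))
    qd, kd, vd = q8.float() * sq, k8.float() * sk, v8.float() * sv
    scores = qd @ kd.transpose(-2, -1) / math.sqrt(q.shape[-1])
    probs = torch.softmax(scores, dim=-1)  # high-precision softmax
    return probs @ vd

q, k, v = (torch.randn(2, 8, 128, 64) for _ in range(3))
out = fp8_sdpa(q, k, v)
ref = torch.nn.functional.scaled_dot_product_attention(q, k, v)
print((out - ref).abs().max())  # small error from FP8 rounding
```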
FALQON accelerates LoRA fine-tuning by up to 3$\times$ by merging adapters into an FP8-quantized backbone, removing the redundant quantization overhead of small matrices.
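FALQON's exact procedure differs in detail, but the core merge step can be sketched as below, assuming per-tensor FP8 scales and the standard LoRA parameterization $W + \frac{\alpha}{r} BA$. The names `merge_lora_into_fp8` and `quantize_fp8` are illustrative, not the paper's API.

```python
import torch

F8 = torch.float8_e4m3fn
F8_MAX = torch.finfo(F8).max

def quantize_fp8(w: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Per-tensor FP8 quantization (assumed scheme, not FALQON's exact one)."""
    scale = w.abs().amax().clamp(min=1e-12) / F8_MAX
    return (w / scale).to(F8), scale

def merge_lora_into_fp8(w8, w_scale, lora_a, lora_b, alpha, rank):
    """Fold a LoRA update into an FP8 backbone weight.

    Dequantize the backbone, add the low-rank update B @ A (scaled by
    alpha / rank, the usual LoRA convention), and requantize once.
    After the merge, the forward pass is a single FP8 matmul, with no
    separate small-matrix LoRA path to quantize at every step.
    """
    w = w8.float() * w_scale
    w = w + (alpha / rank) * (lora_b @ lora_a)
    return quantize_fp8(w)

d_out, d_in, r = 4096, 4096, 16
w8, s = quantize_fp8(torch.randn(d_out, d_in))
a = torch.randn(r, d_in) * 0.01   # stand-in for trained LoRA factors
b = torch.randn(d_out, r) * 0.01
w8_new, s_new = merge_lora_into_fp8(w8, s, a, b, alpha=32, rank=r)
```

The point of the merge is that quantizing the tiny $B$ and $A$ matrices separately on every step costs more than it saves; folding them into the backbone amortizes quantization to a single pass over $W$.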