Deokjae Lee

PhD student, Seoul National University

1 paper at NeurIPS 2025

Homepage · OpenReview · Semantic Scholar · Google Scholar

Poster Session 6

1 paper

Friday, December 5, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C, D, E

Q-Palette: Fractional-Bit Quantizers Toward Optimal Bit Allocation for Efficient LLM Deployment

#3411 · Deokjae Lee, Hyun Oh Song

We develop Q-Palette, a quantizer suite with efficient inference CUDA kernels and wide fractional-bit support, enabling mixed-scheme quantization that achieves ~36% faster LLM decoding than NormalFloat while improving accuracy.