2 papers across 2 sessions
This paper presents LittleBit, a novel framework that combines latent matrix factorization with a multi-scale compensation mechanism to compress Large Language Models (LLMs) to ultra-low bit widths.
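To make the latent-factorization idea concrete, here is a minimal illustrative sketch of compressing a weight matrix via a truncated rank-r factorization. This is a generic low-rank example, not the LittleBit algorithm itself; the matrix size and rank are assumed for illustration.

```python
import numpy as np

# Illustrative only: approximate a weight matrix W with a rank-r
# factorization U_r @ diag(s_r) @ Vt_r, storing far fewer parameters.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))

r = 32  # latent rank (assumed hyperparameter)
U, s, Vt = np.linalg.svd(W, full_matrices=False)
W_hat = U[:, :r] * s[:r] @ Vt[:r, :]  # best rank-r reconstruction

orig_params = W.size                          # 512 * 512
fact_params = r * (W.shape[0] + W.shape[1])   # r * (m + n)
print(f"compression ratio: {orig_params / fact_params:.1f}x")
print(f"relative error: {np.linalg.norm(W - W_hat) / np.linalg.norm(W):.3f}")
```

In a real ultra-low-bit scheme the factors themselves would also be quantized, with a compensation mechanism recovering the accuracy lost at extreme bit widths.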
We unify the diverse attention patterns of visual generative models, which benefits both attention sparsification and quantization.