Low rank decomposition
1 paper across 1 session
Poster Session 4
1 paper
Thursday, December 4, 2025 · 4:30 PM → 7:30 PM
Exhibit Hall C,D,E
Efficient Low Rank Attention for Long-Context Inference in Large Language Models
#3512
Li Tenghui, Guoxu Zhou, Xuyang Zhao, Yuning Qiu, Qibin Zhao
Uses lightweight low-rank (q K) to help index the offloaded KV cache.