Staff Algorithm Engineer, CLC, Meituan
2 papers at NeurIPS 2025
We propose a bilevel framework for compressing LLMs that jointly learns layer-wise sparsity and low-rank structure, using truncated Gaussian sampling and efficient matrix approximation to improve the performance of the compressed models.
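
As a rough illustration of the ingredients named above, the sketch below draws a sparsity ratio for one layer from a truncated Gaussian and approximates that layer's weight matrix as a sparse part plus a low-rank part. It is not the paper's method: the bilevel optimization is omitted, and the function names, magnitude-based pruning, truncated SVD, and all numeric settings are assumptions chosen for illustration.

```python
import numpy as np


def sample_truncated_gaussian(mean, std, low, high, rng):
    """Rejection-sample one value from N(mean, std^2) restricted to [low, high]."""
    while True:
        x = rng.normal(mean, std)
        if low <= x <= high:
            return x


def sparse_plus_low_rank(W, sparsity, rank):
    """Approximate W as S + L.

    S keeps the largest-magnitude entries of W (a fraction 1 - sparsity);
    L is the best rank-`rank` approximation of the residual W - S.
    """
    k = int(round((1.0 - sparsity) * W.size))  # number of entries kept in S
    S = np.zeros_like(W)
    if k > 0:
        # Flat indices of the k largest-magnitude entries (C order).
        idx = np.argpartition(np.abs(W).ravel(), W.size - k)[W.size - k:]
        S.flat[idx] = W.flat[idx]
    # Truncated SVD of the residual gives the low-rank part.
    U, s, Vt = np.linalg.svd(W - S, full_matrices=False)
    L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return S, L


# Hypothetical single-layer example; all values are illustrative assumptions.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
sparsity = sample_truncated_gaussian(mean=0.7, std=0.1, low=0.5, high=0.9, rng=rng)
S, L = sparse_plus_low_rank(W, sparsity, rank=8)
rel_err = np.linalg.norm(W - S - L) / np.linalg.norm(W)
print(f"sparsity={sparsity:.3f}, relative error={rel_err:.3f}")
```

In a layer-wise setting, one sparsity ratio (and possibly one rank) would be drawn per layer; how those choices are learned is the part this sketch leaves out.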