Based on a systematic empirical analysis of post-training compression, we propose a calibration data curation framework that helps pruning and quantization methods better preserve critical LLM capabilities.