logo
today local_bar
Poster Session 4 · Thursday, December 4, 2025 4:30 PM → 7:30 PM
#1603

Activation-Guided Consensus Merging for Large Language Models

NeurIPS Poster OpenReview

Abstract

Recent research has increasingly focused on reconciling the reasoning capabilities of System 2 with the efficiency of System 1. While existing training-based and prompt-based approaches face significant challenges in terms of efficiency and stability, model merging emerges as a promising strategy to integrate the diverse capabilities of different Large Language Models (LLMs) into a unified model.
However, conventional model merging methods often assume uniform importance across layers, overlooking the functional heterogeneity inherent in neural components. To address this limitation, we propose Activation-Guided Consensus Merging (ACM), a plug-and-play merging framework that determines layer-specific merging coefficients based on mutual information between activations of pre-trained and fine-tuned models. ACM effectively preserves task-specific capabilities without requiring gradient computations or additional training.
Extensive experiments on Long-to-Short (L2S) and general merging tasks demonstrate that ACM consistently outperforms all baseline methods. For instance, in the case of Qwen-7B models, TIES-Merging equipped with ACM achieves a 55.3% reduction in response length while simultaneously improving reasoning accuracy by 1.3 points. We submit the code with the paper for reproducibility, and it will be publicly available.
Poster