Poster Session 6 · Friday, December 5, 2025 4:30 PM → 7:30 PM
#900
DSAS: A Universal Plug-and-Play Framework for Attention Optimization in Multi-Document Question Answering
Abstract
While large language models (LLMs) show considerable promise across many fields, they have notable limitations in multi-document question answering (Multi-doc QA). The first challenge is long-range dependency modeling: LLMs struggle to focus on key information in long texts, which weakens important semantic connections. The second is the "lost-in-the-middle" issue, where LLMs have difficulty using information located in the middle of long inputs. Existing remedies either truncate global dependencies or require costly fine-tuning, so a simple, universal solution to these challenges is still missing.
To address these limitations, we propose Dual-Stage Adaptive Sharpening (DSAS), which comprises two modules (sketched in code after the list below):
- The Contextual Gate Weighting (CGW) module alleviates the "lost-in-the-middle" issue by assessing paragraph relevance through layer-wise attention tracking and position-aware weighting.
- The Reciprocal Attention Suppression (RAS) module enhances focus on critical paragraphs by suppressing information exchange between key and irrelevant texts, thus mitigating the limitations in long-range dependency modeling.
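The following Python sketch illustrates one plausible reading of the two modules under stated assumptions. The function names (`contextual_gate_weights`, `reciprocal_attention_mask`), the exact position-aware weighting form, and the `top_k` paragraph selection are illustrative assumptions, not the authors' implementation.

```python
import torch

def contextual_gate_weights(attn, para_spans, alpha=0.1):
    """Hypothetical CGW sketch: score each paragraph's relevance from
    layer-averaged attention mass, with a position-aware correction that
    boosts middle paragraphs (assumed form).

    attn:        (layers, heads, seq, seq) attention weights
    para_spans:  list of (start, end) token index ranges, one per paragraph
    alpha:       strength of the position-aware re-weighting (assumed)
    """
    # Average attention over layers and heads; column sums give the
    # attention mass each token receives across the sequence.
    mean_attn = attn.mean(dim=(0, 1))            # (seq, seq)
    token_mass = mean_attn.sum(dim=0)            # attention received per token

    n = len(para_spans)
    scores = torch.zeros(n)
    for i, (s, e) in enumerate(para_spans):
        scores[i] = token_mass[s:e].mean()

    # Position-aware weighting: up-weight middle positions, which the
    # "lost-in-the-middle" effect tends to under-attend.
    pos = torch.arange(n, dtype=torch.float32)
    centrality = 1.0 - (pos - (n - 1) / 2).abs() / max((n - 1) / 2, 1)
    weights = scores * (1.0 + alpha * centrality)
    return weights / weights.sum()


def reciprocal_attention_mask(weights, para_spans, seq_len, top_k=2):
    """Hypothetical RAS sketch: build an additive attention mask that
    suppresses information exchange between the top-k (key) paragraphs
    and the remaining (irrelevant) paragraphs."""
    key = set(torch.topk(weights, k=min(top_k, len(weights))).indices.tolist())
    mask = torch.zeros(seq_len, seq_len)
    for i, (si, ei) in enumerate(para_spans):
        for j, (sj, ej) in enumerate(para_spans):
            if (i in key) != (j in key):         # key <-> irrelevant pairs only
                mask[si:ei, sj:ej] = float("-inf")
    return mask  # added to the attention logits before softmax
```

In this reading, CGW produces per-paragraph gate weights from the model's own attention maps, and RAS converts those weights into a mask applied at inference time, so no fine-tuning is required.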
Extensive experiments on four benchmarks demonstrate DSAS's efficacy across mainstream LLMs (Llama, Qwen, Mistral, and Deepseek), with an average F1-score improvement of 4.2% in Multi-doc QA tasks on Llama-3.1-8B-Instruct and Qwen2.5-14B-Instruct. Ablation studies confirm the essential contributions of both the CGW and RAS modules. In addition, detailed discussions in the Appendix further validate the robustness and scalability of DSAS.