A-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings

Xiaoang Xu, Shuo Wang, Xu Han, Zhenghao Liu, Huijia Wu, Pei Pei Li, Zhiyuan Liu, Maosong Sun, Zhaofeng He

Beijing University of Posts and Telecommunications · Tsinghua University · Northeastern University · Beijing National Research Center for Information Science and Technology

long-to-short thought compression reasoning efficient CoT

NeurIPS

Abstract

Large Reasoning Models (LRMs) achieve superior performance by extending the length of their thoughts. However, a lengthy thinking trajectory reduces efficiency. Most existing methods assume that the model is overthinking and try to make reasoning efficient by compressing the Chain-of-Thought, but this often degrades performance.

To address this problem, we introduce A-Thought, a unified and efficient framework based on tree search that identifies and isolates the most essential thoughts in the extensive reasoning chains these models produce. It formulates the reasoning process of LRMs as a search tree, where each node represents a reasoning span in the vast reasoning space.

By combining the A* search algorithm with a cost function specific to the reasoning path, it can efficiently compress the chain of thought and determine a reasoning path with high information density and low cost. We also propose a bidirectional importance estimation mechanism, which further refines this search process and makes it more efficient than uniform sampling.
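To make the search formulation concrete, the following is a minimal toy sketch of an A*-style search over reasoning spans. It is not the authors' implementation: the per-span `importance` scores, the token-count cost, the `target` threshold, and the function name `astar_compress` are all assumptions introduced here for illustration, and the cost function described in the paper may differ substantially.

```python
import heapq
from typing import List, Optional, Tuple

# Hypothetical span representation: (token_count, importance_score).
Span = Tuple[int, float]

def astar_compress(spans: List[Span], target: float) -> Optional[List[int]]:
    """Pick spans (in order) with minimal total tokens whose importance sums to >= target.

    Toy objective only; the real cost function is assumed, not taken from the paper.
    """
    n = len(spans)
    # suffix_ratio[i]: cheapest tokens-per-unit-importance among spans[i:].
    suffix_ratio = [float("inf")] * (n + 1)
    for i in range(n - 1, -1, -1):
        tokens, imp = spans[i]
        ratio = tokens / imp if imp > 0 else float("inf")
        suffix_ratio[i] = min(ratio, suffix_ratio[i + 1])

    def h(i: int, acc: float) -> float:
        # Admissible lower bound: remaining importance deficit bought at the
        # best available rate (a fractional-knapsack relaxation).
        deficit = target - acc
        return 0.0 if deficit <= 0 else deficit * suffix_ratio[i]

    # Frontier entries: (f = g + h, g = tokens kept so far, next index, importance, kept indices).
    frontier = [(h(0, 0.0), 0, 0, 0.0, ())]
    while frontier:
        f, g, i, acc, kept = heapq.heappop(frontier)
        if acc >= target:
            return list(kept)  # first goal popped is optimal because h is admissible
        if i == n or f == float("inf"):
            continue  # dead end: no remaining span can close the deficit
        tokens, imp = spans[i]
        # Branch 1: keep span i.
        heapq.heappush(frontier, (g + tokens + h(i + 1, acc + imp), g + tokens,
                                  i + 1, acc + imp, kept + (i,)))
        # Branch 2: skip span i.
        heapq.heappush(frontier, (g + h(i + 1, acc), g, i + 1, acc, kept))
    return None

spans = [(120, 0.5), (300, 0.25), (40, 0.25), (200, 0.5), (60, 0.25)]
print(astar_compress(spans, target=1.0))
```

In this toy setting the heuristic plays the role that the abstract assigns to importance estimation: it steers expansion toward spans that contribute the most per token, so far fewer nodes are expanded than in an exhaustive enumeration of keep/skip decisions.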

Extensive experiments on several advanced math tasks show that A-Thought effectively balances performance and efficiency over a huge search space. Specifically, A-Thought can improve the performance of QwQ-32B by 2.39× under a low budget and reduce the output token length by nearly 50% under a high budget. The proposed method is also compatible with several other LRMs, demonstrating its generalization capability.