We accelerate attention-flow computation by pruning the attention graph, achieving substantial speedups while preserving interpretability.
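The idea can be sketched as follows: attention flow treats the stack of attention matrices as a layered graph, and pruning removes low-weight edges before propagating attribution through it. The snippet below is a minimal illustration using an attention-rollout-style propagation with edge pruning; the function name, the fixed `threshold` parameter, and the residual/renormalization choices are assumptions for illustration, not the paper's exact method.

```python
import numpy as np

def pruned_attention_rollout(attentions, threshold=0.05):
    """Propagate attention through layers, pruning weak edges first.

    attentions: list of (seq_len, seq_len) head-averaged attention
    matrices, one per layer, each row-stochastic.
    threshold: edges with weight below this value are dropped
    (hypothetical pruning rule for this sketch).
    """
    n = attentions[0].shape[0]
    rollout = np.eye(n)
    for att in attentions:
        att = 0.5 * (att + np.eye(n))                 # model the residual connection
        att = np.where(att < threshold, 0.0, att)     # prune weak edges (sparsifies the graph)
        att = att / att.sum(axis=-1, keepdims=True)   # renormalize rows after pruning
        rollout = att @ rollout                       # compose with earlier layers
    return rollout
```

Because the residual term keeps at least 0.5 on each diagonal, no row is emptied by pruning, and the result stays row-stochastic, so each output row still reads as an attribution distribution over input tokens.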