2 papers across 2 sessions
This paper introduces a method for identifying the specific low-dimensional signals that causally drive attention, enabling efficient, single-pass circuit discovery and revealing novel, model-wide control mechanisms.