PhD student, Northeastern University
2 papers at NeurIPS 2025
We propose a novel feature attribution method that disentangles the attribution due to a feature's value from the attribution due to its position within a sequence.
We propose, analyze, and validate a method for guiding LLM behavior at inference time by applying steering vectors to query and value representations.