PhD student, Beijing University of Aeronautics and Astronautics
1 paper at NeurIPS 2025
We propose a test-time detoxification framework that models toxicity transitions in the latent representation space, providing stable and precise guidance for representation editing.