MS student, Yonsei University
1 paper at NeurIPS 2025
We propose SAFEPATH, a lightweight method that aligns Large Reasoning Models to detect and suppress harmful chain-of-thought reasoning by injecting a brief safety signal at the start of reasoning.
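To make "injecting a brief safety signal at the start of reasoning" concrete, here is a minimal sketch of how such a prefix might be placed in front of a model's chain of thought. It covers only the text-construction step, not the alignment (fine-tuning) itself. The reasoning delimiter `<think>`, the signal wording, and the function name `build_prefixed_input` are all assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical names throughout: the tag and the signal text are illustrative
# assumptions, not values taken from the SAFEPATH paper.
REASONING_OPEN = "<think>"  # assumed reasoning-start delimiter
SAFETY_SIGNAL = "Let's think about safety first."  # assumed short safety signal


def build_prefixed_input(prompt: str,
                         signal: str = SAFETY_SIGNAL,
                         open_tag: str = REASONING_OPEN) -> str:
    """Return the prompt followed by the reasoning opener and the safety signal.

    A model would continue generating its chain of thought after the signal;
    the rest of the reasoning is left unconstrained in this sketch.
    """
    return f"{prompt}\n{open_tag}\n{signal}\n"


example = build_prefixed_input("How do I reset a forgotten router password?")
print(example)
```

Under these assumptions, only the first few tokens of the reasoning are fixed, which is what keeps the approach lightweight.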