MS student, Tsinghua University
4 papers at NeurIPS 2025
This paper proposes Panacea, a post-fine-tuning method that mitigates the effects of harmful fine-tuning on large language models, preserving safety alignment without sacrificing performance across different tasks and models.