Research Fellow, Nanyang Technological University
4 papers at NeurIPS 2025
We detect and remove backdoor samples from MLLM fine-tuning data by identifying their abnormal attention-entropy patterns, without requiring clean reference data or model modifications.
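A minimal sketch of what attention-entropy screening might look like, assuming per-sample attention maps from the fine-tuned model are available; the helper names, z-score rule, and threshold below are illustrative assumptions, not the paper's actual procedure.

```python
import torch

def attention_entropy(attn: torch.Tensor) -> torch.Tensor:
    """Mean entropy of a sample's attention distributions.

    attn: (num_heads, query_len, key_len), each row summing to 1.
    """
    eps = 1e-12
    ent = -(attn * (attn + eps).log()).sum(dim=-1)  # (heads, queries)
    return ent.mean()

def flag_suspicious(entropies: torch.Tensor, z_thresh: float = 3.0) -> torch.Tensor:
    """Flag samples whose entropy deviates strongly from the batch statistics."""
    z = (entropies - entropies.mean()) / entropies.std().clamp_min(1e-8)
    return z.abs() > z_thresh  # boolean mask over samples
```

Usage would be along the lines of `entropies = torch.stack([attention_entropy(a) for a in per_sample_attns])`, then dropping the flagged samples before fine-tuning.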
We propose a safety-driven unlearning framework for diffusion models that better preserves unlearning performance even after downstream fine-tuning.
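As background on what "unlearning" means for diffusion models, here is a generic ESD-style concept-erasure objective; it is explicitly not the paper's safety-driven framework, and the mechanism that keeps erasure robust to later fine-tuning is not shown. `unet`, `frozen_unet`, and the embedding arguments are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def erasure_loss(unet, frozen_unet, x_t, t, concept_emb, null_emb, eta=1.0):
    """ESD-style target: steer the conditional prediction away from the concept."""
    with torch.no_grad():
        eps_c = frozen_unet(x_t, t, concept_emb)   # frozen prediction with the unsafe concept
        eps_0 = frozen_unet(x_t, t, null_emb)      # frozen unconditional prediction
        target = eps_0 - eta * (eps_c - eps_0)     # push away from the concept direction
    eps_pred = unet(x_t, t, concept_emb)           # trainable model, conditioned on the concept
    return F.mse_loss(eps_pred, target)
```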
FOA-Attack enhances targeted adversarial transferability to closed-source MLLMs by optimally aligning global and local features.
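A rough sketch of joint global-plus-local feature alignment, assuming ViT-style features (a CLS token followed by patch tokens) from a surrogate encoder; the Sinkhorn solver, shapes, and loss weighting are simplifying assumptions rather than FOA-Attack's exact formulation.

```python
import torch
import torch.nn.functional as F

def sinkhorn_plan(cost: torch.Tensor, n_iters: int = 50, eps: float = 0.1) -> torch.Tensor:
    """Entropic optimal-transport plan with uniform marginals for an (n, m) cost matrix."""
    K = torch.exp(-cost / eps)
    u = torch.full((cost.size(0),), 1.0 / cost.size(0), device=cost.device)
    v = torch.full((cost.size(1),), 1.0 / cost.size(1), device=cost.device)
    a, b = u.clone(), v.clone()
    for _ in range(n_iters):
        a = u / (K @ b).clamp_min(1e-30)
        b = v / (K.t() @ a).clamp_min(1e-30)
    return a[:, None] * K * b[None, :]  # transport plan

def alignment_loss(adv_feats: torch.Tensor, tgt_feats: torch.Tensor, lam: float = 1.0):
    """Global cosine alignment plus OT-based local alignment.

    adv_feats/tgt_feats: (1 + num_patches, dim) -- CLS token, then patch tokens.
    """
    global_loss = 1 - F.cosine_similarity(adv_feats[0], tgt_feats[0], dim=-1)
    p_adv = F.normalize(adv_feats[1:], dim=-1)
    p_tgt = F.normalize(tgt_feats[1:], dim=-1)
    cost = 1 - p_adv @ p_tgt.t()          # pairwise cosine cost between patches
    plan = sinkhorn_plan(cost.detach())   # plan held fixed; gradients flow through cost
    local_loss = (plan * cost).sum()
    return global_loss + lam * local_loss
```

Detaching the cost when solving for the plan keeps the matching fixed during backpropagation, so gradients only move the adversarial features toward their matched target features.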