Researcher, UK AI Safety Institute
2 papers at NeurIPS 2025
Defences against LLM misuse fine-tuning attacks that rely on detecting individual malicious or suspicious training samples are insufficient.