2 papers across 2 sessions
We combine discrete and continuous adversarial attacks to adversarially train more robust LLMs. Evaluated across realistic inference settings, our models are more robust than other state-of-the-art models while matching their training cost.
A surprisingly simple plug-and-play method to strengthen adversarial image protection against diverse purification techniques.