PhD student, Carnegie Mellon University (CMU)
2 papers at NeurIPS 2025
Antidistillation sampling strategically modifies a model's next-token probability distribution to poison reasoning traces, rendering them significantly less effective for distillation while preserving the model's practical utility.
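The core mechanism described above — perturbing the next-token distribution before sampling — can be sketched as follows. This is an illustrative toy, not the paper's actual method: the `penalty` scores (meant to stand in for some estimate of how useful each token is to a distilling student) and the scaling factor `alpha` are hypothetical placeholders.

```python
import math
import random

def softmax(logits):
    """Convert a list of logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def antidistill_sample(logits, penalty, alpha=1.0):
    """Sample a token index from logits perturbed by a per-token penalty.

    `penalty` is a hypothetical per-token score estimating each token's
    usefulness to a student distilling from the trace; subtracting
    alpha * penalty down-weights such tokens before sampling.
    """
    adjusted = [l - alpha * p for l, p in zip(logits, penalty)]
    probs = softmax(adjusted)
    r = random.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1  # guard against floating-point rounding
```

With `alpha = 0` this reduces to ordinary sampling from the model's distribution, which is the sense in which the perturbation can be tuned to trade off trace poisoning against the model's own utility.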
We present a data-centric pretraining framework that builds safety into the model from the start.