Postdoc, Carnegie Mellon University (CMU)
3 papers at NeurIPS 2025
Machine learning researchers must urgently work with policymakers to address growing risks from embodied AI by plugging gaps in existing frameworks.
Antidistillation sampling strategically modifies a model's next-token probability distribution to poison reasoning traces, rendering them significantly less effective for distillation while preserving the model's practical utility.
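The core idea of adjusting the next-token distribution can be sketched as follows. This is a toy illustration only, not the actual antidistillation sampling algorithm: the function name and the `poison_scores` input (a hypothetical per-token estimate of how useful emitting that token would be to a would-be distillation student) are assumptions for the sketch, and the real method computes its perturbation differently.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D logit vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def antidistill_sample(logits, poison_scores, lam=1.0, rng=None):
    """Toy sketch: sample from a perturbed next-token distribution.

    `poison_scores` is a hypothetical stand-in for a per-token score of
    distillation usefulness; tokens that would help a student model are
    down-weighted by strength `lam`, shifting probability mass toward
    tokens that are less useful for distillation.
    """
    rng = rng or np.random.default_rng()
    adjusted = logits - lam * poison_scores  # penalize distillation-useful tokens
    probs = softmax(adjusted)
    return int(rng.choice(len(probs), p=probs))
```

With `lam = 0` this reduces to ordinary sampling from the model's distribution; the sketch omits how the per-token scores would actually be computed, which is where the method itself lives.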
We present a data-centric pretraining framework that builds safety into the model from the start.