PhD student, Duke University
2 papers at NeurIPS 2025
We present a model-aware approach that uses the model’s own signals to select training data dynamically, improving both training efficiency and data efficiency in RL fine-tuning.