PhD student, University of Texas at Austin
1 paper at NeurIPS 2025
We present a model-aware approach that uses the model’s own signals to select training data dynamically, markedly improving both training efficiency and data efficiency in RL fine-tuning.