PhD student, University of Alberta
1 paper at NeurIPS 2025
An efficient preference-based reinforcement learning (PbRL) method that mitigates overfitting and overestimation via dual regularization, improving feedback efficiency in both online and offline settings