Assistant Professor, University of California, Los Angeles
1 paper at NeurIPS 2025
We characterize the sample complexity of average-reward offline RL with function approximation in weakly communicating MDPs.