PhD student, Pennsylvania State University
1 paper at NeurIPS 2025
We propose novel model-free RL and FRL algorithms that simultaneously achieve the best-known near-optimal regret, a low burn-in cost, and a logarithmic policy switching cost or communication cost.