Professor, Tsinghua University
1 paper at NeurIPS 2025
This paper proposes an LLM-based plug-in, compatible with various RL algorithms, that improves the efficiency of policy exploration during RL training.