PhD student, Nanyang Technological University
2 papers at NeurIPS 2025
To enhance the reliability of the reward model for improving the current policy, we develop the Proximal Policy Exploration (PPE) algorithm, which increases the coverage of the preference buffer in near-policy regions.
We propose MTRec, a novel sequential recommendation framework that uses a learned mental reward model to guide the recommendation model toward users' real preferences.