
Simon Shaolei Du

Assistant Professor, University of Washington

6 papers at NeurIPS 2025

Homepage · OpenReview · Semantic Scholar · Google Scholar

Poster Session 1

Wednesday, December 3, 2025 · 11:00 AM → 2:00 PM

Exhibit Hall C,D,E

Sharp Gap-Dependent Variance-Aware Regret Bounds for Tabular MDPs

#3314 · Shulun Chen, Runlong Zhou, Zihan Zhang, Maryam Fazel, Simon Shaolei Du

We develop a variance-aware gap-dependent regret bound with better $H$ dependence for tabular MDPs.

Deployment Efficient Reward-Free Exploration with Linear Function Approximation

#5305 · Zihan Zhang, Yuxin Chen, Jason D. Lee, Simon Shaolei Du, Lin Yang, Ruosong Wang

We provide a computationally efficient algorithm that achieves $O(H)$ deployment cost with polynomial sample complexity.

Poster Session 5

Friday, December 5, 2025 · 11:00 AM → 2:00 PM

Exhibit Hall C,D,E

A Minimalist Example of Edge-of-Stability and Progressive Sharpening

#4004 · Liming Liu, Zixuan Zhang, Simon Shaolei Du, Tuo Zhao

A new minimalist example for understanding the Edge of Stability and Progressive Sharpening phenomena.

Highlighting What Matters: Promptable Embeddings for Attribute-Focused Image Retrieval

#4702 · Siting Li, Xiang Gao, Simon Shaolei Du

We build a benchmark for attribute-focused text-to-image retrieval and propose a pipeline that uses promptable image embeddings to solve it, which improves performance.

Poster Session 6

Friday, December 5, 2025 · 4:30 PM → 7:30 PM

Exhibit Hall C,D,E

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

#415 · Yiping Wang, Qing Yang, Zhiyuan Zeng, Liliang Ren, Liyuan Liu, Baolin Peng, Hao Cheng, Xuehai He, Kuan Wang, Jianfeng Gao, Weizhu Chen, Shuohang Wang, Simon Shaolei Du, Yelong Shen

RLVR on LLMs needs only one training example to achieve significant improvement on math tasks.

Understanding the Gain from Data Filtering in Multimodal Contrastive Learning

#3817 · Divyansh Pareek, Sewoong Oh, Simon Shaolei Du

We theoretically analyze the benefit of filtering a noisy training dataset on model performance in multimodal contrastive learning, and identify two regimes with different amounts of gain.