Full Professor, Tsinghua University
4 papers at NeurIPS 2025
This work proposes a flexible neighborhood constraint for offline RL. It mitigates the over-conservatism of density and sample constraints while approximating the support constraint, the least restrictive of these, without modeling the behavior policy.