Researcher, Alibaba Group
1 paper at NeurIPS 2025
We propose PlanU, a method that improves LLM-based decision-making under uncertainty by modeling value distributions via quantile regression and guiding Monte Carlo Tree Search (MCTS) exploration with a novel Upper Confidence Bounds with Curiosity (UCC) score.
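As a rough illustration of the quantile-regression ingredient only, the sketch below computes the standard pinball (quantile) loss for a set of predicted quantiles against sampled returns. It is not PlanU's implementation: the function names, the quantile levels, and the example numbers are all hypothetical, and the UCC score is not sketched because its formula is not given here.

```python
def pinball_loss(target, prediction, tau):
    """Standard pinball (quantile) loss for one quantile level tau in (0, 1)."""
    diff = target - prediction
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1) * diff


def quantile_regression_loss(targets, quantile_preds, taus):
    """Mean pinball loss of the predicted quantiles over all sampled targets."""
    total = 0.0
    for target in targets:
        for pred, tau in zip(quantile_preds, taus):
            total += pinball_loss(target, pred, tau)
    return total / (len(targets) * len(taus))


# Hypothetical data: sampled returns for one state and two candidate
# sets of quantile estimates at levels 0.25, 0.5, 0.75.
taus = [0.25, 0.5, 0.75]
returns = [0.0, 1.0, 2.0, 3.0]
good = quantile_regression_loss(returns, [0.75, 1.5, 2.25], taus)
bad = quantile_regression_loss(returns, [3.0, 3.0, 3.0], taus)
```

Quantile estimates that track the spread of the sampled returns (`good`) incur a lower loss than a single collapsed estimate (`bad`), which is what lets a set of quantiles represent a value distribution rather than just its mean.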