Listwise Preference Diffusion Optimization for User Behavior Trajectories Prediction

Hongtao Huang, Chengkai Huang, Junda Wu, Tong Yu, Julian McAuley, Lina Yao

University of New South Wales· Macquarie University· University of California San Diego· Adobe Research· CSIRO Data61

User Behavior Prediction Diffusion Models

Abstract

Forecasting multi-step user behavior trajectories requires reasoning over structured preferences across future actions, a challenge overlooked by traditional sequential recommendation. This problem is critical for applications such as personalized commerce and adaptive content delivery, where anticipating a user’s complete action sequence enhances both satisfaction and business outcomes. We identify an essential limitation of existing paradigms: their inability to capture global, listwise dependencies among sequence items.

To address this, we formulate User Behavior Trajectory Prediction (UBTP) as a new task setting that explicitly models longterm user preferences. We introduce Listwise Preference Diffusion Optimization (LPDO), a diffusion-based training framework that directly optimizes structured preferences over entire item sequences. LPDO incorporates a Plackett–Luce supervision signal and derives a tight variational lower bound aligned with listwise ranking likelihoods, enabling coherent preference generation across denoising steps and overcoming the independent-token assumption of prior diffusion methods.

To rigorously evaluate multi-step prediction quality, we propose the task-specific metric: Sequential Match (SeqMatch), which measures exact trajectory agreement, and adopt Perplexity (PPL), which assesses probabilistic fidelity.

Extensive experiments on real-world user behavior benchmarks demonstrate that LPDO consistently outperforms state-of-the-art baselines, establishing a new benchmark for structured preference learning with diffusion models.