2 papers across 2 sessions
A generative model imputes missing outcomes to drive Thompson Sampling decisions, yielding a flexible algorithm with regret tied to offline prediction quality.