We benchmark Feel-Good Thompson Sampling for contextual bandits with MCMC methods and show that it performs well in the linear setting but poorly on neural bandit tasks.