We propose a computationally tractable multinomial logit (MNL) contextual bandit algorithm designed to handle generic non-linear parametric utility functions.
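
To make the setting concrete, the sketch below computes choice probabilities under a standard MNL model for an offered set of items with a non-linear parametric utility. It is an illustration only, not the proposed algorithm: the function names (`mnl_choice_probabilities`, `tanh_utility`), the tanh form of the utility, and the outside "no choice" option with utility fixed at 0 are assumptions made here for the example.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def mnl_choice_probabilities(
    contexts: Sequence[Vector],
    theta: Vector,
    utility: Callable[[Vector, Vector], float],
) -> Tuple[List[float], float]:
    """Return (item probabilities, no-choice probability) under an MNL model.

    Assumption: the outside (no-choice) option has utility 0, so
    P(item i) = exp(u_i) / (1 + sum_j exp(u_j)).
    """
    utils = [utility(x, theta) for x in contexts]
    # Shift by the largest utility (including the outside option's 0)
    # so that exp() cannot overflow; the shift cancels in the ratio.
    shift = max([0.0] + utils)
    exp_items = [math.exp(u - shift) for u in utils]
    exp_outside = math.exp(-shift)
    denom = exp_outside + sum(exp_items)
    return [e / denom for e in exp_items], exp_outside / denom


def tanh_utility(x: Vector, theta: Vector) -> float:
    """Hypothetical non-linear utility: tanh of the inner product <x, theta>."""
    return math.tanh(sum(xi * ti for xi, ti in zip(x, theta)))


if __name__ == "__main__":
    # Three offered items, each described by a 2-dimensional context (made-up values).
    contexts = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    theta = [0.8, -0.3]
    item_probs, no_choice_prob = mnl_choice_probabilities(contexts, theta, tanh_utility)
    print(item_probs, no_choice_prob)
```

Any other differentiable function of the context and parameter could be passed as `utility`; the linear case corresponds to returning the inner product directly.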