1 paper across 1 session
FraPPE is a computationally efficient algorithm for identifying the exact Pareto optimal set with fixed confidence under any preference cone in a vector-valued bandit. It is provably asymptotically optimal and achieves the lowest sample complexity in numerical experiments.
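To make the target of identification concrete, here is a minimal sketch of what "Pareto optimal set under a preference cone" could mean when the arm means are already known. It is not FraPPE and involves no sampling or stopping rule. It assumes a polyhedral cone C = {x : Wx >= 0} and a maximization convention in which arm i is dominated by arm j when mu_j - mu_i is a nonzero element of C. The function name `pareto_set`, the matrix `W`, and the example means are all hypothetical.

```python
import numpy as np

def pareto_set(means, W=None, tol=1e-12):
    """Indices of arms not dominated under the cone C = {x : W @ x >= 0}.

    Assumption (not from the source): arm i is dominated by arm j when
    means[j] - means[i] is a nonzero vector lying in C. With W equal to
    the identity, C is the positive orthant and this is the usual
    componentwise Pareto order for maximization.
    """
    means = np.asarray(means, dtype=float)
    k, d = means.shape
    W = np.eye(d) if W is None else np.asarray(W, dtype=float)
    optimal = []
    for i in range(k):
        dominated = False
        for j in range(k):
            if i == j:
                continue
            diff = means[j] - means[i]
            nonzero = np.any(np.abs(diff) > tol)
            in_cone = np.all(W @ diff >= -tol)
            if nonzero and in_cone:
                dominated = True
                break
        if not dominated:
            optimal.append(i)
    return optimal

# Hypothetical mean vectors for four arms with two objectives each.
mu = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.6], [0.4, 0.4]]

orthant = pareto_set(mu)                            # standard Pareto order
wider = pareto_set(mu, W=[[0.8, 1.0], [1.0, 0.8]])  # cone wider than the orthant
```

Under the positive orthant, only the last arm is dominated, so the first three arms form the Pareto set. The wider cone tolerates a small loss in one objective for a larger gain in the other, so the balanced arm (0.6, 0.6) dominates every other arm and the Pareto set shrinks to that single arm. A bandit algorithm has to recover such a set from noisy samples rather than from known means.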