2 papers across 2 sessions
We design a PAC-learner for contextual combinatorial semi-bandits with sparse rewards, with a sample complexity bound that primarily scales with the sparsity parameter rather than the number of arms.
This work introduces a method for solving NP-class combinatorial problems with a vanilla Transformer. By combining Sudoku rules with guesses, the approach achieves state-of-the-art results (99.8%). Solution length is analyzed via the Min-Sum Set Cover problem.