2 papers across 2 sessions
We design a PAC-learner for contextual combinatorial semi-bandits with sparse rewards, with a sample complexity bound that primarily scales with the sparsity parameter rather than the number of arms.
Implicit bias for p-norm normalized steepest descent (NSD) and momentum steepest descent (NMD) algorithms in multi-class linear classification with cross-entropy loss.