2 papers across 2 sessions
AdaSTaR enhances STaR with adaptive sampling that promotes diversity and follows a curriculum, reducing training data imbalance. It achieves the best accuracy across six benchmarks while reducing training FLOPs by 58.6%.